Abstract

In this paper we propose a method to estimate the density matrix $\rho$
of a $d$-level quantum system by measurements on the $N$-fold system in
the joint state $\rho^{\otimes N}$. The scheme is based on covariant observables and
representation theory of unitary groups and it extends previous results
concerning pure states and the estimation of the spectrum of $\rho$. We show
that it is consistent (i.e. the original input state $\rho$ is recovered with
certainty if $N \to \infty$), analyze its large deviation behavior, and calculate
explicitly the corresponding rate function which describes the exponential
decrease of error probabilities in the limit $N \to \infty$. Finally we discuss the
question whether the proposed scheme provides the fastest possible decay
of error probabilities.

1 Introduction 

The density operator $\rho$ of a $d$-level quantum system ($d \in \mathbb{N}$) describes the
preparation of the system in all details relevant to statistical experiments, and
the task of quantum state estimation is to determine $\rho$ by measurements on a
(possibly large) number $N$ of systems, which are all prepared according to $\rho$. In
the limit of infinitely many input systems it is of course possible to get exact
estimates. If $N$ remains finite, however, estimation errors are unavoidable. The
best we can get (if $N$ is large enough) is an estimation scheme which produces
only small errors or, more precisely, which produces large errors only with a small
probability.

There are several ways to get "good" estimation schemes. One possibility
is to choose an appropriate figure of merit which measures the quality of the
estimates (e.g. averaged fidelities with respect to the original density matrix) and
to solve the corresponding optimization problem. If we know a priori that the
input state $\rho$ is pure (but otherwise unknown) this approach is very successful
and leads to optimal estimators, which can be given in closed form for all finite
values of $N$; cf. e.g. |21 EOl El El HI EHl El- In the general case, however (i.e.
if nothing is known about $\rho$) the situation is much more difficult. First of all
the result depends much more on the figure of merit chosen than in the pure
state case, and even if we have found an appropriate quality criterion it is in
general very hard to determine the corresponding optimal estimator explicitly
for arbitrary $N$; some results related to this approach can be found in |36[I15[ I^.

A way out of this dilemma is to neglect the quality of the estimates for
finite $N$ and to look for estimation schemes which guarantee at least that error
probabilities vanish "as fast as possible" as $N$ goes to infinity (cf. |22[ I24j ; for
a collection of recent publications on the subject see also ^Hl). There are two
approaches which implement this somewhat vague idea in a mathematically
exact way. One possibility is to look at variances (rescaled by $N$) in the limit
$N \to \infty$. This is done in several works (cf. in particular the
papers reprinted in |T2|) and it leads to quantum analogs of classical
Cramér-Rao type bounds. The second idea is to analyze the large deviation behavior
of the estimators. To make this more precise let us denote an estimate derived
from a measurement on $N$ systems in the joint state $\rho^{\otimes N}$ by $\sigma$. Then we can
look at the probability $P_{N,\epsilon}$ that the trace-norm distance between $\rho$ and $\sigma$ (or
any other appropriate distance measure for states) is at least $\epsilon$, i.e. $\|\rho - \sigma\|_1 \geq \epsilon$.
Since $\rho = \sigma$ would be the exact estimate this is clearly an error probability. Now
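we are interested in those cases where $P_{N,\epsilon}$ vanishes exponentially fast in $N$, i.e.

$$P_{N,\epsilon} \approx C_N \exp\Bigl(-N \inf_{\|\rho-\sigma\|_1 \geq \epsilon} I(\rho,\sigma)\Bigr). \tag{1}$$
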
Here $C_N$, $N \in \mathbb{N}$, is an unknown sequence of positive real numbers, growing at
most subexponentially with $N$ (and which is of no interest for the following),
and $I(\rho,\sigma)$ is a positive function which vanishes iff $\sigma = \rho$ holds. $I$ is called the
rate function because it describes the exponential rate with which estimation
errors vanish asymptotically. In classical statistics this analysis was initiated
by Bahadur |31 ^ |S] and has in the meantime become a classical topic
("Bahadur efficiency"). About the quantum case, however, much less is known, and
the results available so far cover three different areas: 1. In [571^1^ an
explicit scheme to estimate the spectrum of $\rho$ is proposed and its rate function
is calculated. The latter is shown to be optimal in [21]. 2. The rate function of
the optimal pure state estimator is calculated in U^l- 3. In [20] the behavior of
quantities like $\lim_{\epsilon\to 0} \inf_{\|\rho-\sigma\|_1>\epsilon} I(\rho,\sigma)$ is analyzed for one-parameter families
of states, and the relation to quantum Fisher information is discussed.

The purpose of the present paper is to extend the results about the spectrum
in 123 and about pure states in ^Hl in two respects. Firstly, we will propose a
scheme to estimate the full density matrix which is based on covariant
observables PH and which reduces to [27] if we look only at the spectrum of $\rho$. And
secondly, we will pose the question whether the proposed scheme is
"asymptotically optimal", i.e. whether its rate function is bigger than the rate function of
any other scheme. There is of course no guarantee that a given set of functions
admits a maximal element, but in the classical case it is known that such an
"optimal rate function" exists (and is given by the classical relative entropy;
this is again a consequence of Bahadur's work PlEIISl). For quantum systems,
however, the situation is, not very surprisingly, much more difficult.

The outline of the paper is as follows: In Section 2 we will give a more
formal introduction to the questions we are considering and in Section 3 we will state
our main results. The proofs and a more detailed discussion are then distributed
among Section 4 (where we will consider U(d)-covariant estimation schemes) and
Section 5 (where upper bounds on rate functions will be discussed).

2 Basic definitions

In this section we will present some mathematical preliminaries (in particular
basic definitions and terminology) concerning quantum state estimation. A short
summary of material from the theory of large deviations used throughout this
paper can be found in Appendix A.

2.1 State estimation 

Let us consider the $d$-dimensional Hilbert space $\mathcal{H} = \mathbb{C}^d$ and the corresponding
set $\mathcal{S}$ of density operators. The task of quantum state estimation is to determine
a state $\rho \in \mathcal{S}$ by a measurement on an $N$-fold system, which is prepared in
the joint state $\rho^{\otimes N}$. Mathematically this can be described by a normalized
POV measure $E_N$ on the state space $\mathcal{S}$ with values in the algebra $\mathcal{B}(\mathcal{H}^{\otimes N})$ of
(bounded) operators on $\mathcal{H}^{\otimes N}$. More precisely, $E_N$ is a (strongly) $\sigma$-additive set
function

$$E_N : \mathfrak{B}(\mathcal{S}) \to \mathcal{B}(\mathcal{H}^{\otimes N}) \quad\text{with}\quad E_N(\Delta) \geq 0,\ E_N(\emptyset) = 0,\ E_N(\mathcal{S}) = \mathbf{1}, \tag{2}$$

on the Borel $\sigma$-algebra $\mathfrak{B}(\mathcal{S})$ of $\mathcal{S}$, and the probability to get an estimate in a
Borel set $\Delta \subset \mathcal{S}$ is given by

$$\mu_{N,\rho}(\Delta) = \mathrm{tr}\bigl(\rho^{\otimes N} E_N(\Delta)\bigr). \tag{3}$$

Since the number $N$ of systems is arbitrary, we need a whole sequence $(E_N)_{N\in\mathbb{N}}$ of
observables and we will call each such sequence in the following a full estimation
scheme. For a good estimation scheme the quality of the estimates should
increase with $N$, i.e. the error probability should decrease and in the limit of
infinitely many input systems the estimate should be exact; in other words the
sequence of probability measures $(\mu_{N,\rho})_{N\in\mathbb{N}}$ should converge for each $\rho$ weakly
to the point measure concentrated at $\rho$. Such an estimation scheme is called
consistent.

If we are interested not in the whole state but only in some special properties
of $\rho$ (e.g. its von Neumann entropy), described by a function $\mathcal{S} \ni \rho \mapsto p(\rho) \in X$
taking its values in a locally compact, separable metric space $X$, we have to
consider more generally POV measures $E_N : \mathfrak{B}(X) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ on $X$ instead
of $\mathcal{S}$. As before, $\mathrm{tr}(\rho^{\otimes N} E_N(\Delta))$ is the probability to get an estimate in $\Delta \subset X$.
Estimating the spectrum of a density operator is a particular example of this
kind. In this case $p$ coincides with

$$s : \mathcal{S} \to \Sigma = \Bigl\{ x \in [0,1]^d \,\Big|\, x_1 \geq \cdots \geq x_d \geq 0,\ \sum_{j=1}^d x_j = 1 \Bigr\} \tag{4}$$

which maps a density operator $\rho$ to its spectrum $s(\rho) \in \Sigma$, i.e. $s_j(\rho) = \langle \chi_j, \rho \chi_j \rangle$
where $\chi_1, \ldots, \chi_d$ denotes an appropriate eigenbasis of $\rho$. We will call $\Sigma$ the set
of ordered spectra and $s$ the canonical projection onto $\Sigma$. Let us summarize the
discussion up to now in the following definition.

Definition 2.1 Consider a finite dimensional Hilbert space $\mathcal{H}$, the corresponding
set $\mathcal{S}$ of density operators, and a function $p : \mathcal{S} \to X$ taking its values in
the locally compact, separable metric space $X$. A sequence $(E_N)_{N\in\mathbb{N}}$ of POV
measures $E_N : \mathfrak{B}(X) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ is called a p-estimation scheme (or just an
estimation scheme if there is no danger of confusion). A p-estimation scheme is
called consistent, if the sequence $(\mu_{N,\rho})_{N\in\mathbb{N}}$ of probability measures defined in
(3) converges for each $\rho \in \mathcal{S}$ weakly to a point measure concentrated at $p(\rho) \in X$.

We recover both cases we are mainly interested in if we set $X = \mathcal{S}$ and $p = \mathrm{Id}$
for the full problem and $X = \Sigma$ and $p = s$ for spectral estimation.

Of special importance in this work are estimation schemes with additional
symmetry properties: Let us denote the permutation group on $N$ points by $S_N$
and its natural representation on $\mathcal{H}^{\otimes N}$ by $V$, i.e.

$$V_\sigma\, \psi_1 \otimes \cdots \otimes \psi_N = \psi_{\sigma^{-1}(1)} \otimes \cdots \otimes \psi_{\sigma^{-1}(N)}, \quad \sigma \in S_N,\ \psi_1, \ldots, \psi_N \in \mathcal{H}. \tag{5}$$

An estimation scheme $(E_N)_{N\in\mathbb{N}}$ is called permutation invariant, if

$$V_\sigma E_N(\Delta) V_\sigma^* = E_N(\Delta) \quad \forall \sigma \in S_N\ \forall \Delta \in \mathfrak{B}(X) \tag{6}$$

holds. Likewise, it is called U(d)-covariant (or just covariant) if U(d) acts
continuously on $X$ by $\mathrm{U}(d) \times X \ni (U, x) \mapsto \alpha_U(x) \in X$ such that the conditions

$$U^{\otimes N} E_N(\Delta) U^{\otimes N *} = E_N(\alpha_U(\Delta)) \quad \forall U \in \mathrm{U}(d)\ \forall \Delta \in \mathfrak{B}(X) \tag{7}$$

and

$$p(U \rho U^*) = \alpha_U(p(\rho)) \quad \forall U \in \mathrm{U}(d)\ \forall \rho \in \mathcal{S} \tag{8}$$

are satisfied. If the scheme $(E_N)_{N\in\mathbb{N}}$ is consistent, covariance of the projection
$p$ (8) is implied by covariance of the measures $E_N$ (7). Furthermore, note that
the U(d) operation $\alpha_U$ is uniquely determined (if it exists) due to surjectivity
of $p$. For full estimation we have $\alpha_U(\rho) = U \rho U^*$ and for spectral estimation it
is the trivial action, i.e. $\alpha_U(x) = x$. Hence, covariant estimation schemes are
defined in both cases we are interested in.

2.2 Large deviations 

Consider now a Borel set $\Delta \subset X$ and a state $\rho \in \mathcal{S}$ such that $p(\rho) \notin \bar{\Delta}$
(the closure of $\Delta$). The quantity $\mu_{N,\rho}(\Delta)$ is then the probability to get a false
estimate in $\Delta$. If the scheme is consistent this probability goes to zero. This is,
however, a very weak statement because the convergence can be very slow. As
already pointed out in the introduction, we are therefore interested in schemes
where convergence of error probabilities to zero is exponentially fast; in other
words for each $\rho \in \mathcal{S}$ the sequence $(\mu_{N,\rho})_{N\in\mathbb{N}}$ of probability measures from
Equation (3) should satisfy the large deviation principle^1 with a rate function
$I(\rho,\,\cdot\,)$. This idea leads to the following definition:

Definition 2.2 A p-estimation scheme $(E_N)_{N\in\mathbb{N}}$, as described in Definition
2.1, satisfies the large deviation principle (LDP) with rate function $I : \mathcal{S} \times X \to [0,\infty]$ if

1. $I_\rho = I(\rho,\,\cdot\,)$ is a rate function (cf. Definition A.1) for each $\rho \in \mathcal{S}$.

2. $I(\rho, x) = 0$ iff $p(\rho) = x$ holds.

3. The sequence $(\mu_{N,\rho})_{N\in\mathbb{N}}$ of probability measures satisfies for each $\rho \in \mathcal{S}$
the large deviation principle with rate function $I_\rho$.

^1 A short summary of definitions and theorems from large deviations theory which are
relevant for this paper can be found in Appendix A.

Note that condition 2 guarantees that each scheme which satisfies the LDP
is consistent, because the $\mu_{N,\rho}(\Delta)$ converge to 0, if $\Delta$ is a closed set which does
not contain $p(\rho)$. Occasionally we will have to refer to the rate function $I$ of an
estimation scheme $(E_N)_{N\in\mathbb{N}}$ without using $(E_N)_{N\in\mathbb{N}}$ directly. In this case we
will call $I$ an admissible rate function.

Definition 2.3 A function $I : \mathcal{S} \times X \to [0,\infty]$ which is the rate function of a
p-estimation scheme is called p-admissible (or just admissible if $p$ is understood).
The set of all p-admissible rate functions is denoted by $\mathcal{E}(p)$.

We do not yet know how continuous or discontinuous admissible rate
functions can be in their first argument. E.g. an otherwise very bad estimation
scheme might provide very fast exponential decay for a particular input state.
The discussion in Sections 4.1 and 5.4 will indicate that discontinuities might
occur in particular at the boundary of the state space, while the behavior in the
interior of $\mathcal{S}$ (i.e. at non-degenerate density matrices) seems to be more regular.
To avoid such difficulties let us introduce the following subset of $\mathcal{E}(p)$:

$$\mathcal{E}^0(p) = \{ I \in \mathcal{E}(p) \mid I \text{ is lower semi-continuous} \}. \tag{9}$$

If the map $p$ we want to estimate is covariant in the sense of Equation (8) we
can introduce in addition

$$\mathcal{E}^c(p) = \{ I \in \mathcal{E}(p) \mid I \text{ is covariant} \}, \tag{10}$$

where we call an admissible rate function covariant, if it is the rate function of a
U(d)-covariant estimation scheme. In contrast to this, any function $F : \mathcal{S} \times X \to [0,\infty]$
is called U(d)-invariant if

$$F(U \rho U^*, \alpha_U(x)) = F(\rho, x) \quad \forall U \in \mathrm{U}(d)\ \forall \rho \in \mathcal{S}\ \forall x \in X \tag{11}$$

is satisfied. Obviously, each admissible rate function which is covariant is
U(d)-invariant too. It is not clear whether the converse holds as well (i.e. whether
U(d)-invariance of $I \in \mathcal{E}(p)$ implies covariance). However, problems can occur
only on the boundary of $\mathcal{S}$ (i.e. for degenerate density matrices) and even there
only if $I$ is not lower semicontinuous (cf. Section for details). Finally note
that U(d)-invariance of $I \in \mathcal{E}^0(p)$ implies, together with lower semi-continuity
of $I_\rho(\,\cdot\,) = I(\rho,\,\cdot\,)$, lower semi-continuity of $I^x(\,\cdot\,) = I(\,\cdot\,, x)$ along the orbits of
the U(d) action on $\mathcal{S}$. The general relation between $\mathcal{E}^c(p)$ and $\mathcal{E}^0(p)$ is, however,
not clear (i.e. $I \in \mathcal{E}^c(p)$ can be discontinuous transversal to the orbits).

Ideally, we would like to have estimation schemes $(E_N)_{N\in\mathbb{N}}$ which provide
the fastest possible exponential decay of error probabilities. Hence, for a given
map $p : \mathcal{S} \to X$ we are mainly interested in the quantities

$$\mathcal{I}_p(\rho,\sigma) = \sup_{I \in \mathcal{E}(p)} I(\rho,\sigma), \qquad \mathcal{I}^0_p(\rho,\sigma) = \sup_{I \in \mathcal{E}^0(p)} I(\rho,\sigma) \tag{12}$$

and

$$\mathcal{I}^c_p(\rho,\sigma) = \sup_{I \in \mathcal{E}^c(p)} I(\rho,\sigma). \tag{13}$$

The functions $\mathcal{I}^\#_p : \mathcal{S} \times X \to [0,\infty]$ thus defined (following the notation
introduced above, we will write $\mathcal{I}^\#_{\mathrm{Id}}$ for full and $\mathcal{I}^\#_s$ for spectral estimation), are the
least upper bounds on the sets $\mathcal{E}^\#(p)$, but they are not necessarily admissible
themselves. In slight abuse of language we will nevertheless call them the
optimal rate functions. If $\mathcal{I}_p$ can be realized as the rate function of a particular
estimation scheme $(E_N)_{N\in\mathbb{N}}$, we will call $(E_N)_{N\in\mathbb{N}}$ (strongly) asymptotically
optimal.

3 Summary of main results 

A particular example for asymptotic optimality arises in classical estimation
theory (for finite probability distributions). It is known from Bahadur efficiency
O ^ |S] that the classical relative entropy is an upper bound for all admissible
rate functions; and Sanov's theorem (cf. e.g. JJ)^) states that this bound can
be achieved by the empirical distribution (i.e. relative frequencies in a given
sample). The latter provides therefore an asymptotically optimal estimation
scheme. For quantum systems the situation is more difficult, and our knowledge
is (unfortunately) not yet as complete as for classical estimation. Nevertheless,
we have some significant partial results which we want to summarize in this
section. The proofs and a more detailed discussion are postponed to Sections 4
and 5.
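
The classical statement can be checked numerically in the simplest case of a coin with success probability $p$. The following is only a sketch: the function names are ours, and the error event "relative frequency at least $a$" with $a > p$ is an illustrative choice. It evaluates the exact binomial tail and compares its exponential decay rate with the relative entropy of $(a, 1-a)$ with respect to $(p, 1-p)$, which is the value of the Sanov rate on that event.

```python
import math


def log_binom_pmf(n, k, p):
    """Natural log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def log_error_probability(n, p, a):
    """log P(X/n >= a), summed in log space to avoid underflow."""
    logs = [log_binom_pmf(n, k, p) for k in range(math.ceil(a * n), n + 1)]
    m = max(logs)
    return m + math.log(sum(math.exp(v - m) for v in logs))


def relative_entropy(a, p):
    """Classical relative entropy of (a, 1-a) with respect to (p, 1-p)."""
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))


def decay_rate(n, p, a):
    """Empirical exponential rate -(1/n) ln P(X/n >= a)."""
    return -log_error_probability(n, p, a) / n


if __name__ == "__main__":
    p, a = 0.3, 0.5
    print("relative entropy:", relative_entropy(a, p))
    for n in (200, 2000, 20000):
        print(n, decay_rate(n, p, a))
```

The finite-$N$ rates approach the relative entropy from above, with a correction that shrinks roughly like $\ln(N)/N$; this is the subexponential prefactor that the rate function ignores.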

3.1 Estimating the spectrum 

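The most complete result is available for spectral estimation. To state it let us
recall the definition of the spectral estimation scheme mentioned in the
introduction. It is based on the decomposition of the representation $U \mapsto U^{\otimes N}$
of the unitary group U(d) into irreducible components. The latter is given by

$$\mathcal{H}^{\otimes N} = \bigoplus_{Y \in \mathcal{Y}_d(N)} \mathcal{H}_Y \otimes \mathcal{K}_Y, \qquad U^{\otimes N} = \bigoplus_{Y \in \mathcal{Y}_d(N)} \pi_Y(U) \otimes \mathbf{1}, \tag{14}$$

where $\mathcal{Y}_d(N)$ denotes the set of Young frames with $d$ rows and $N$ boxes

$$\mathcal{Y}_d(N) = \Bigl\{ Y \in \mathbb{N}^d \,\Big|\, Y_1 \geq \cdots \geq Y_d,\ \sum_{j=1}^d Y_j = N \Bigr\}, \tag{15}$$

$\pi_Y$ denotes the irreducible representation with highest weight^2 $Y$, and $\mathcal{K}_Y$ is a
multiplicity space which carries an irreducible representation of the symmetric
group $S_N$ on $N$ elements:

$$V_\sigma = \bigoplus_{Y \in \mathcal{Y}_d(N)} \mathbf{1} \otimes \Pi_Y(\sigma), \quad \sigma \in S_N, \tag{16}$$

where $V_\sigma$ is defined in Equation (5) and $\Pi_Y$ is the irreducible $S_N$ representation
defined by the Young frame $Y$.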

^2 More precisely the $Y_1, \ldots, Y_d$ are the components of the highest weight in a particular
basis of the Cartan subalgebra.
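
The index set $\mathcal{Y}_d(N)$ is small enough to enumerate directly. The sketch below (helper names are ours, and we assume that rows of length zero are allowed, i.e. that $\mathbb{N}$ contains 0) lists all Young frames with $d$ rows and $N$ boxes together with the normalized frames $Y/N$, which are points of the set of ordered spectra.

```python
from fractions import Fraction


def young_frames(d, n):
    """All tuples Y_1 >= ... >= Y_d >= 0 with Y_1 + ... + Y_d = n."""
    def rec(rows_left, remaining, max_part):
        if rows_left == 0:
            if remaining == 0:
                yield ()
            return
        for first in range(min(max_part, remaining), -1, -1):
            # The rows below are at most `first` long; stop as soon as
            # they can no longer hold the boxes that are still left over.
            if first * (rows_left - 1) < remaining - first:
                break
            for rest in rec(rows_left - 1, remaining - first, first):
                yield (first,) + rest
    return list(rec(d, n, n))


def normalized(frame):
    """The point Y/N of the set of ordered spectra, as exact fractions."""
    n = sum(frame)
    return tuple(Fraction(y, n) for y in frame)


if __name__ == "__main__":
    for frame in young_frames(3, 6):
        print(frame, [str(q) for q in normalized(frame)])
```

For fixed $d$ the number of frames grows only polynomially in $N$, so the normalized frames fill the set of ordered spectra more and more densely while the number of possible outcomes stays subexponential.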
Now we can define a spectral estimation scheme $(F_N)_{N\in\mathbb{N}}$ by

$$F_N(\Delta) = \sum_{Y/N \in \Delta} P_Y, \tag{17}$$

where $P_Y$ denotes the projection onto $\mathcal{H}_Y \otimes \mathcal{K}_Y$:

$$P_Y \in \mathcal{B}(\mathcal{H}^{\otimes N}), \quad P_Y^* = P_Y, \quad P_Y^2 = P_Y, \quad P_Y \mathcal{H}^{\otimes N} = \mathcal{H}_Y \otimes \mathcal{K}_Y. \tag{18}$$

In other words $F_N$ is a discrete measure with normalized Young frames $Y/N$ as
possible estimates and the probability to get the outcome $Y/N$ for input systems
in the joint state $\rho^{\otimes N}$ is $\mathrm{tr}(\rho^{\otimes N} P_Y)$. As mentioned in the introduction, $F_N$
satisfies the large deviation principle with the classical relative entropy between
the probability vectors $x \in \Sigma$ and $s(\rho)$ as the rate function $I(\rho, x)$. As we will
see in Subsection 5.2 this is in fact the best that can be achieved (cf. also |2(Jj).

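Theorem 3.1 The spectral estimation scheme $(F_N)_{N\in\mathbb{N}}$ defined in (17) is
asymptotically optimal; i.e. it satisfies the LDP with the optimal rate function
$\mathcal{I}_s$ defined in Equation (12). In addition $\mathcal{I}_s = \mathcal{I}^0_s = \mathcal{I}^c_s$ holds, and $\mathcal{I}_s$ is given
explicitly by

$$\mathcal{S} \times \Sigma \ni (\rho, x) \mapsto \mathcal{I}_s(\rho, x) = \sum_{j=1}^d x_j \bigl[ \ln(x_j) - \ln(s_j(\rho)) \bigr], \tag{19}$$

where $s : \mathcal{S} \to \Sigma$ is the canonical projection from Equation (4).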
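
As a small illustration of Equation (19), the following sketch evaluates the rate function for a given estimate $x$ and a given spectrum $s(\rho)$. The function name and the conventions $0 \ln 0 = 0$ and $-\ln 0 = +\infty$ are our assumptions, chosen to match the usual definition of relative entropy.

```python
import math


def spectral_rate(x, s):
    """Sum_j x_j [ln x_j - ln s_j] for an estimate x and a true spectrum s."""
    total = 0.0
    for xj, sj in zip(x, s):
        if xj == 0.0:
            continue          # convention 0 * ln 0 = 0
        if sj == 0.0:
            return math.inf   # estimate has weight where the true spectrum has none
        total += xj * (math.log(xj) - math.log(sj))
    return total


if __name__ == "__main__":
    true_spectrum = (0.75, 0.25)
    for estimate in ((0.75, 0.25), (0.6, 0.4), (0.5, 0.5), (1.0, 0.0)):
        print(estimate, spectral_rate(estimate, true_spectrum))
```

The value is zero exactly at the true spectrum and grows as the estimate moves away from it, so estimates far from $s(\rho)$ are exponentially less likely than nearby ones.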

3.2 The full density matrix 

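For the full problem the best scheme $(E_N)_{N\in\mathbb{N}}$ we have found so far is defined
by the integral (with an arbitrary continuous function $f : \mathcal{S} \to \mathbb{R}$)

$$\int_{\mathcal{S}} f(\rho)\, E_N(d\rho) = \sum_{Y \in \mathcal{Y}_d(N)} \dim \mathcal{H}_Y \int_{\mathrm{U}(d)} f(U \rho_{Y/N} U^*)\, |\pi_Y(U)\phi_Y\rangle\langle\pi_Y(U)\phi_Y| \otimes \mathbf{1}\, dU, \tag{20}$$

where $\phi_Y \in \mathcal{H}_Y$ is the highest weight vector of the irreducible representation
$\pi_Y$ and $\rho_x$ denotes for each $x \in \Sigma$ the diagonal density matrix

$$\rho_x = \mathrm{diag}(x_1, \ldots, x_d). \tag{21}$$

The main properties of this scheme are: It projects to the spectral estimation
scheme $F_N$ from Subsection 3.1,

$$E_N(s^{-1}(\Delta)) = F_N(\Delta) \quad \forall \Delta \in \mathfrak{B}(\Sigma), \tag{22}$$

it is covariant (i.e. Equation (7) holds with $\alpha_U(\rho) = U \rho U^*$) and permutation
invariant (cf. Equation (6)). Measuring $E_N$ can therefore be regarded as a two
step process: First measure the observable $F_N$ in terms of the instrument $T$,
which is defined by the family of channels (given in the Schrödinger picture):

$$T_Y : \mathcal{B}(\mathcal{H}^{\otimes N}) \ni \omega \mapsto \mathrm{tr}_{\mathcal{K}_Y}(P_Y \omega P_Y) \in \mathcal{B}(\mathcal{H}_Y), \quad Y \in \mathcal{Y}_d(N), \tag{23}$$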
where $\mathrm{tr}_{\mathcal{K}_Y}$ denotes the partial trace over $\mathcal{K}_Y$ and the $P_Y$ are again the
projections from (18). If the estimate for the spectrum we get in this way (with
probability $\mathrm{tr}(P_Y \rho^{\otimes N})$) is $Y/N$, the output of $T$ is a quantum system (described
by the Hilbert space $\mathcal{H}_Y$, hence of different type than the input system^3) in
the state $\mathrm{tr}(P_Y \rho^{\otimes N})^{-1} T_Y(\rho^{\otimes N})$. On this system we perform a measurement of
a covariant observable $E_Y$ with values in $\mathcal{S}_Y = s^{-1}(Y/N)$ which is defined by
the integral

$$\int_{\mathcal{S}_Y} f(\sigma)\, E_Y(d\sigma) = \int_{\mathrm{U}(d)} f(U \rho_{Y/N} U^*)\, |\pi_Y(U)\phi_Y\rangle\langle\pi_Y(U)\phi_Y|\, dU, \tag{24}$$

(where $f$ denotes now a continuous function on $\mathcal{S}_Y$) and this gives us an estimate
for the eigenvectors of $\rho$. In the special case of pure states (i.e. if the first
measurement gives $Y/N = (1, 0, 0, \ldots, 0)$) the observable $E_Y$ is given by

$$\int_{\mathcal{P}} f(\sigma)\, E_Y(d\sigma) = \int_{\mathcal{P}} f(\sigma)\, \sigma^{\otimes N}\, d\sigma, \quad \text{for } Y = (N, 0, \ldots, 0), \tag{25}$$

where $\mathcal{P} = s^{-1}(1, 0, \ldots, 0)$ denotes the set of pure states. This observable is
known to optimize for each $N$ global quality criteria like averaged fidelity |^
ISniEl- Hence we can look at $E_N$ as a direct generalization of the best known
estimation schemes for the spectrum and for pure states. We discuss this point
of view in greater detail in Section ^31 The large deviation behavior of $E_N$ is
described by the following theorem (cf. Section H?5l for a proof):

Theorem 3.2 The full estimation scheme $(E_N)_{N\in\mathbb{N}}$ defined in Equation (20)
satisfies the large deviation principle with rate function $I : \mathcal{S} \times \mathcal{S} \to [0,\infty]$

$$I(\rho, U \rho_x U^*) = \sum_{k=1}^d \Bigl( x_k \ln(x_k) - (x_k - x_{k+1}) \ln\bigl[ \mathrm{pm}_k(U^* \rho U) \bigr] \Bigr) \tag{26}$$

where $x = (x_1, \ldots, x_d) \in \Sigma$, $x_{d+1} = 0$, $\rho_x$ is the density matrix from Equation
(21), $U \in \mathrm{U}(d)$, and $\mathrm{pm}_j(\sigma)$ denotes the principal minor (i.e. the upper left
rank $j$ subdeterminant) of the matrix $\sigma$.
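
To get a feeling for Equation (26), the following sketch evaluates the rate function numerically. It makes two simplifying assumptions that are not part of the theorem: $\rho$ is real symmetric and $U$ is real orthogonal, so that plain floating point arithmetic suffices, and $\rho$ has full rank. All function names are ours.

```python
import math


def matmul(a, b):
    """Product of two matrices given as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(row) for row in zip(*a)]


def det(m):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in m]
    n = len(a)
    result = 1.0
    for c in range(n):
        pivot = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[pivot][c] == 0.0:
            return 0.0
        if pivot != c:
            a[c], a[pivot] = a[pivot], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, n):
            factor = a[r][c] / a[c][c]
            for j in range(c, n):
                a[r][j] -= factor * a[c][j]
    return result


def rate_function(rho, u, x):
    """I(rho, U rho_x U^*) from Eq. (26); real symmetric rho, real orthogonal U."""
    d = len(x)
    m = matmul(transpose(u), matmul(rho, u))      # U^* rho U
    x_ext = list(x) + [0.0]                       # x_{d+1} = 0
    total = 0.0
    for k in range(1, d + 1):
        xk = x_ext[k - 1]
        if xk > 0.0:
            total += xk * math.log(xk)
        gap = xk - x_ext[k]
        if gap > 0.0:
            minor = det([row[:k] for row in m[:k]])   # upper left k x k minor
            if minor <= 0.0:
                return math.inf
            total -= gap * math.log(minor)
    return total


if __name__ == "__main__":
    rho = [[0.5, 0.0, 0.0], [0.0, 0.3, 0.0], [0.0, 0.0, 0.2]]
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(rate_function(rho, identity, (0.5, 0.3, 0.2)))   # estimate equals rho
    print(rate_function(rho, identity, (0.6, 0.3, 0.1)))   # commuting estimate
```

If $\rho$ is diagonal with ordered eigenvalues $r_1 \geq \cdots \geq r_d$ and $U = \mathbf{1}$, the $k$-th principal minor is $r_1 \cdots r_k$, the sum over $k$ telescopes, and the result is the classical relative entropy $\sum_j x_j (\ln x_j - \ln r_j)$.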

The best upper bound on the rate function for full estimation schemes we
have found so far is derived from quantum hypothesis testing.

Theorem 3.3 Each admissible rate function $I : \mathcal{S} \times \mathcal{S} \to [0,\infty]$ is bounded
from above by the relative entropy, i.e.

$$I(\rho, \sigma) \leq S(\rho, \sigma) = \mathrm{tr}\bigl(\sigma \ln(\sigma) - \sigma \ln(\rho)\bigr) \quad \forall \rho, \sigma \in \mathcal{S}. \tag{27}$$

^3 If $d = 2$ holds the situation is special. In this case the output of $T$ can be regarded as
an $M = Y_1 - Y_2$ qubit system, and $T$ itself coincides with the "natural purifier" studied in

The proof will be given in Section IF^ cf also (SOj. It is easy to check
numerically that $I(\rho,\sigma)$ and $S(\rho,\sigma)$ do not coincide in general. If we consider in
particular the qubit case ($d = 2$) and express the density operators $\rho$, $\sigma$ in Bloch
form, i.e.

$$\rho = \tfrac{1}{2}\bigl[\mathbf{1} + \vec{x}\cdot\vec{\sigma}\bigr], \qquad \sigma = \tfrac{1}{2}\bigl[\mathbf{1} + \vec{y}\cdot\vec{\sigma}\bigr] \tag{28}$$

(where $\vec{\sigma} = (\sigma_1, \sigma_2, \sigma_3)$ are the Pauli matrices and $\vec{x}, \vec{y} \in \mathbb{R}^3$ with $|\vec{x}|, |\vec{y}| \leq 1$),
we get for the rate function $I$ from Equation (26)



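$$I(\rho,\sigma) = -S(\sigma) - |\vec{y}| \ln\left[\frac{1 + |\vec{x}| \cos\theta}{2}\right] - \frac{1 - |\vec{y}|}{2} \ln\left[\frac{1 - |\vec{x}|^2}{4}\right], \tag{29}$$

where $\theta$ denotes the angle between $\vec{x}$ and $\vec{y}$ and $S(\sigma)$ is the von Neumann
entropy of $\sigma$. The relative entropy of $\sigma$ and $\rho$ becomes j2]

$$S(\rho,\sigma) = -S(\sigma) - \frac{1}{2} \ln\left[\frac{1 - |\vec{x}|^2}{4}\right] - \frac{|\vec{y}| \cos\theta}{2} \ln\left[\frac{1 + |\vec{x}|}{1 - |\vec{x}|}\right]. \tag{30}$$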


We have plotted both quantities as functions of $\theta$ for two different values of
$|\vec{x}| = |\vec{y}|$ in Figure 1, which shows that $I(\rho,\sigma)$ is in general strictly smaller than
$S(\rho,\sigma)$.
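
The comparison shown in the figure can be reproduced with a few lines of code. The sketch below uses the closed-form qubit expressions obtained by specializing Equations (26) and (27) to Bloch vectors of lengths $r_x = |\vec{x}| < 1$ and $r_y = |\vec{y}|$ enclosing the angle $\theta$; the function names and the chosen grid of angles are ours.

```python
import math


def qubit_entropy(r):
    """von Neumann entropy of a qubit state whose Bloch vector has length r."""
    total = 0.0
    for p in ((1 + r) / 2, (1 - r) / 2):
        if p > 0.0:
            total -= p * math.log(p)
    return total


def rate_function_qubit(rx, ry, theta):
    """Rate function of the covariant scheme for d = 2 (requires rx < 1)."""
    return (-qubit_entropy(ry)
            - ry * math.log((1 + rx * math.cos(theta)) / 2)
            - (1 - ry) / 2 * math.log((1 - rx ** 2) / 4))


def relative_entropy_qubit(rx, ry, theta):
    """tr(sigma ln sigma - sigma ln rho) for d = 2 (requires rx < 1)."""
    return (-qubit_entropy(ry)
            - 0.5 * math.log((1 - rx ** 2) / 4)
            - 0.5 * ry * math.cos(theta) * math.log((1 + rx) / (1 - rx)))


if __name__ == "__main__":
    for r in (0.9, 0.1):
        print("|x| = |y| =", r)
        for step in range(0, 9):
            theta = step * math.pi / 8
            print("  theta = %.3f  I = %.4f  S = %.4f"
                  % (theta, rate_function_qubit(r, r, theta),
                     relative_entropy_qubit(r, r, theta)))
```

The two functions agree at $\theta = 0$ and $\theta = \pi$, where $\rho$ and $\sigma$ commute, and the rate function lies below the relative entropy in between. Their difference is $|\vec{y}|$ times a Jensen gap of the logarithm, which is why it is never negative.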



3.3 Optimal rate functions 

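Hence, for a general input state $\rho$ we only know for sure that the optimal rate
functions defined in Equations (12) and (13) have to satisfy (with $p = \mathrm{Id}$ for full
estimation)

$$I \leq \mathcal{I}^c_{\mathrm{Id}},\ \mathcal{I}^0_{\mathrm{Id}} \leq \mathcal{I}_{\mathrm{Id}} \leq S. \tag{31}$$

This is, however, not as bad as it looks at first glance: Since $S(\rho,\sigma)$ and
$I(\rho,\sigma)$ coincide if $\rho$ and $\sigma$ commute, we get

$$I(\rho,\sigma) = \mathcal{I}^c_{\mathrm{Id}}(\rho,\sigma) = \mathcal{I}^0_{\mathrm{Id}}(\rho,\sigma) = \mathcal{I}_{\mathrm{Id}}(\rho,\sigma) = S(\rho,\sigma) = \sum_{j=1}^d s_j(\sigma) \bigl( \ln s_j(\sigma) - \ln s_j(\rho) \bigr) \quad \forall \rho, \sigma \in \mathcal{S} \text{ with } [\rho,\sigma] = 0. \tag{32}$$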

A second partial result arises if the input state is pure. In Proposition 5.5 we
will show

$$\mathcal{I}^c_{\mathrm{Id}}(\rho,\sigma) = I(\rho,\sigma) \quad \forall \rho, \sigma \in \mathcal{S} \text{ with } \rho \text{ pure}, \tag{33}$$

and in Section 4.4 we will give some heuristic arguments which indicate that
$I$ and $\mathcal{I}^c_{\mathrm{Id}}$ coincide even for general input states. This indicates that $(E_N)_{N\in\mathbb{N}}$
is the best scheme as long as we are insisting on some additional regularity
conditions of the rate function; in the case at hand this is covariance. It is not
clear, however, whether covariance can be replaced by something more general
without breaking the equality with $I$. There are at least some indications (cf.
Section 1^31) that Equation (33) would still hold if we replace $\mathcal{I}^c_{\mathrm{Id}}$ with $\mathcal{I}^0_{\mathrm{Id}}$. Note
that $I \in \mathcal{E}^0(\mathrm{Id})$, hence (33) already implies $\mathcal{I}^0_{\mathrm{Id}}(\rho,\sigma) \geq \mathcal{I}^c_{\mathrm{Id}}(\rho,\sigma)$ for pure $\rho$. Our
conjecture here is that equality holds for all $\rho$ and $\sigma$.

Another result which can be derived easily from Equation (33) and
Proposition 4.9 is $S \notin \mathcal{E}(\mathrm{Id})$, i.e. there is no estimation scheme with relative entropy
as its rate function. This follows from the fact that $S$ is lower semicontinuous
and U(d)-invariant in the sense of Equation (11). Hence $S \in \mathcal{E}(\mathrm{Id})$ would
imply according to Proposition 4.9 $S \in \mathcal{E}^c(\mathrm{Id})$, in contradiction to Equation (33)
and the fact that $S(\rho,\sigma) > I(\rho,\sigma)$ holds for all pure states $\rho, \sigma$ with $\rho \neq \sigma$
and $\rho\sigma \neq 0$. On the other hand there is strong evidence that $\mathcal{I}_{\mathrm{Id}} = S$ holds,
i.e. that $S$ is the best upper bound of the set of all admissible rate functions.
This would imply that we can find for each pair $\rho_0, \sigma_0 \in \mathcal{S}$ a rate function
$J \in \mathcal{E}(\mathrm{Id})$ such that $J(\rho_0,\sigma_0) = S(\rho_0,\sigma_0)$ holds, but $J$ is much smaller than $S$
(most probably even smaller than $I$) almost everywhere else. In Section 5.3 we will
discuss these topics in greater detail. For now, let us summarize all our conjectures
in the following equation:

$$I = \mathcal{I}^c_{\mathrm{Id}} = \mathcal{I}^0_{\mathrm{Id}} < \mathcal{I}_{\mathrm{Id}} = S. \tag{34}$$

Figure 1: Relative entropy and rate function $I$ as a function of the angle $\theta$
between the two Bloch vectors $\vec{x}$ and $\vec{y}$. The upper plot corresponds to the case
$|\vec{x}| = |\vec{y}| = 0.9$ and the lower to $|\vec{x}| = |\vec{y}| = 0.1$.

4 Covariant observables 

The aim of this section is to study estimation schemes which are U(d)-covariant
and permutation invariant, i.e. they do not prefer a special copy of the input
state or a particular direction in the Hilbert space $\mathcal{H}$. Besides a proof of Theorem
3.2 we will provide several general results, which are useful within the discussion
of the questions raised in Section 3.3. Therefore only full estimation schemes are
considered in this section (i.e. $p = \mathrm{Id}$), but most of the results in Subsections 4.2
and 4.3 can be generalized quite easily to p-estimation schemes, if $p$ is sufficiently
covariant.

4.1 Continuity properties 

Let us start with some technical results concerning continuity and uniform
convergence with respect to the original density matrix $\rho$. They will become crucial
within the discussion of group averages in the next section. Some of them,
however, are quite interesting in their own right, and it is therefore reasonable to
devote a whole subsection to them.

Central subjects of this discussion will be integrals of the form

$$h_N(\rho, f) = -\frac{1}{N} \ln \int_{\mathcal{S}} e^{-N f(\sigma)}\, \mathrm{tr}\bigl(\rho^{\otimes N} E_N(d\sigma)\bigr), \tag{35}$$

where $f$ denotes an arbitrary, real valued function on $\mathcal{S}$. Quantities of this form
usually appear in Varadhan's Theorem (cf. Theorem A.3), i.e. if the estimation
scheme $(E_N)_{N\in\mathbb{N}}$ satisfies the LDP with rate function $I$ we have

$$\lim_{N\to\infty} h_N(\rho, f) = h(\rho, f) = \inf_{\sigma \in \mathcal{S}} \bigl( I(\rho,\sigma) + f(\sigma) \bigr). \tag{36}$$

If on the other hand $(E_N)_{N\in\mathbb{N}}$ does not necessarily satisfy the LDP but (36)
holds for each $f$ and a density matrix $\rho$, the sequence of probability measures
$\mathrm{tr}(\rho^{\otimes N} E_N(\,\cdot\,))$ satisfies the Laplace principle (Definition A.4) which is
equivalent to the large deviation principle (Theorem A.5). Hence the study of
convergence properties of the $h_N(\rho, f)$ is a useful tool to prove that the LDP holds for
a given estimation scheme.
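
A classical toy model shows how the functionals $h_N$ behave. In the sketch below the quantum measurement is replaced by the relative frequency of successes in $N$ coin flips with success probability $p$ (this substitution, the function names and the test function $f$ are our assumptions). Its rate function is the classical relative entropy, so the finite-$N$ value can be compared with the variational expression on the right-hand side of Equation (36).

```python
import math


def log_binom_pmf(n, k, p):
    """Natural log of P(X = k) for X ~ Binomial(n, p), with 0 < p < 1."""
    return (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
            + k * math.log(p) + (n - k) * math.log(1 - p))


def relative_entropy(x, p):
    """Classical relative entropy of (x, 1-x) with respect to (p, 1-p)."""
    return x * math.log(x / p) + (1 - x) * math.log((1 - x) / (1 - p))


def h_finite(n, p, f):
    """-(1/n) ln E[exp(-n f(X/n))] for X ~ Binomial(n, p), in log space."""
    logs = [log_binom_pmf(n, k, p) - n * f(k / n) for k in range(n + 1)]
    m = max(logs)
    return -(m + math.log(sum(math.exp(v - m) for v in logs))) / n


def h_limit(p, f, grid=20000):
    """inf over x in (0, 1) of relative_entropy(x, p) + f(x), on a grid."""
    return min(relative_entropy(i / grid, p) + f(i / grid)
               for i in range(1, grid))


if __name__ == "__main__":
    f = lambda x: 2.0 * (x - 0.8) ** 2
    print("limit:", h_limit(0.3, f))
    for n in (50, 500, 5000):
        print(n, h_finite(n, 0.3, f))
```

The minimizer of the variational problem sits between the true value $p$ and the minimum of $f$: the exponential weight $e^{-N f}$ pulls the dominant contribution away from $p$, and the rate function measures what this costs.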

In this section we will discuss continuity of $h$ with respect to $\rho$ and uniformity
of the convergence $h_N \to h$ (again with respect to $\rho$). The most crucial step in
this direction is the following lemma.

Lemma 4.1 Consider an estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfying the LDP with
rate function $I$, an arbitrary continuous (real valued) function $f$ and the
functionals $h_N$, $h$ defined in Equations (35) and (36).

1. For each non-degenerate density matrix $\rho \in \mathcal{S}$ and each sequence
$\mathbb{N} \ni N \mapsto \rho_N \in \mathcal{S}$ converging to $\rho$ we have

$$\lim_{N\to\infty} h_N(\rho_N, f) = \lim_{N\to\infty} h_N(\rho, f) = h(\rho, f). \tag{37}$$

2. If $I$ is lower semicontinuous in both arguments, the lower bound

$$\liminf_{N\to\infty} h_N(\rho_N, f) \geq h(\rho, f) \tag{38}$$

holds even for degenerate $\rho$.

Proof. Let us consider part 1 first. In this case the proof mainly depends
on the following lemma which allows us to represent one sequence as a convex
combination of two others.

Lemma 4.2 Consider two sequences $\mathbb{N} \ni N \mapsto \rho_N^{(j)} \in \mathcal{S}$, $j = 1, 2$, both
converging to the same non-degenerate density matrix $\rho \in \mathcal{S}$. For each $\lambda \in \mathbb{R}$ with
$0 < \lambda < 1$ there exists an integer $N_\lambda \in \mathbb{N}$ and a third sequence $\mathbb{N} \ni N \mapsto \sigma_N \in \mathcal{S}$
such that

$$\rho_N^{(1)} = \lambda \rho_N^{(2)} + (1 - \lambda) \sigma_N \quad \forall N > N_\lambda \tag{39}$$

holds.

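Proof. Let $\kappa = \min_{\|\phi\|=1} \langle \phi, \rho \phi \rangle$ and define

$$\epsilon = \frac{(1 - \lambda)\kappa}{1 + \lambda}. \tag{40}$$

Since $\rho$ is non-degenerate, we have $\kappa > 0$ and therefore $\epsilon > 0$ as well. Hence
there is an $N_\lambda \in \mathbb{N}$ such that (with $\phi \in \mathcal{H}$ and $A \in \mathcal{B}(\mathcal{H})$)

$$\sup_{\|\phi\|=1} \bigl| \langle \phi, (\rho_N^{(j)} - \rho) \phi \rangle \bigr| \leq \sup_{\|A\|=1} \bigl| \mathrm{tr}\bigl((\rho_N^{(j)} - \rho) A\bigr) \bigr| \tag{41}$$

$$= \|\rho_N^{(j)} - \rho\|_1 < \epsilon \tag{42}$$

holds for all $N > N_\lambda$ and for $j = 1, 2$. In addition we see by the triangle
inequality that

$$\sup_{\|\phi\|=1} \bigl| \langle \phi, (\rho_N^{(1)} - \rho_N^{(2)}) \phi \rangle \bigr| < 2\epsilon \tag{43}$$

holds as well for all $N > N_\lambda$. Now define

$$\delta = \frac{\lambda}{1 - \lambda} = \frac{\kappa - \epsilon}{2\epsilon} \tag{44}$$

(the second equality follows from Equation (40)) and

$$\sigma_N = -\delta \rho_N^{(2)} + (1 + \delta) \rho_N^{(1)} \quad \text{for } N > N_\lambda \tag{45}$$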
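(and $\sigma_N \in \mathcal{S}$ arbitrary otherwise). Obviously $\mathrm{tr}(\sigma_N) = 1$ and

$$(1 - \lambda) \sigma_N = \rho_N^{(1)} - \lambda \rho_N^{(2)}. \tag{46}$$

Hence

$$\rho_N^{(1)} = \lambda \rho_N^{(2)} + (1 - \lambda) \sigma_N \quad \forall N > N_\lambda \tag{47}$$

as stated.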

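It only remains to show that $\sigma_N \geq 0$ (and therefore $\sigma_N \in \mathcal{S}$) holds for all
$N > N_\lambda$. This follows, for each $\phi \in \mathcal{H}$ with $\|\phi\| = 1$, from

$$\langle \phi, \sigma_N \phi \rangle = -\delta \langle \phi, \rho_N^{(2)} \phi \rangle + (1 + \delta) \langle \phi, \rho_N^{(1)} \phi \rangle \tag{48}$$

$$\geq -2\delta\epsilon - \delta \langle \phi, \rho_N^{(1)} \phi \rangle + (1 + \delta) \langle \phi, \rho_N^{(1)} \phi \rangle \tag{49}$$

$$= -2\delta\epsilon + \langle \phi, \rho_N^{(1)} \phi \rangle \geq -2\delta\epsilon + \langle \phi, \rho \phi \rangle - \epsilon \tag{50}$$

$$\geq -2\delta\epsilon + \kappa - \epsilon = -\epsilon(2\delta + 1) + \kappa = 0, \tag{51}$$

where we have used Equation (43) in (49), Equation (42) in (50) and the
definition of $\delta$ (44) in (51). □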
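Now let us apply this lemma to $\rho_N^{(1)} = \rho$ and $\rho_N^{(2)} = \rho_N$ for all $N \in \mathbb{N}$. For
each $\lambda \in (0, 1)$ we get an $N_\lambda \in \mathbb{N}$ such that $h_N(\rho, f) = h_N(\lambda \rho_N + (1 - \lambda) \sigma_N, f)$
holds for all $N > N_\lambda$. Hence

$$\lim_{N\to\infty} h_N(\rho, f) = \lim_{N\to\infty} h_N(\lambda \rho_N + (1 - \lambda) \sigma_N, f). \tag{52}$$

Using the definition of $h_N$ in (35) we get:

$$h_N(\lambda \rho_N + (1 - \lambda) \sigma_N, f) = -\frac{1}{N} \ln \Bigl[ \lambda^N e^{-N h_N(\rho_N, f)} + \sum_{n=1}^{N} \lambda^{N-n} (1 - \lambda)^n \int_{\mathcal{S}} e^{-N f(\sigma)}\, \mathrm{tr}\bigl(A_{N,n} E_N(d\sigma)\bigr) \Bigr], \tag{53}$$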

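where $A_{N,n}$ denotes the sum of all tensor products consisting of $N - n$ factors
$\rho_N$ and $n$ factors $\sigma_N$. We can rewrite this expression as

$$h_N(\lambda \rho_N + (1 - \lambda) \sigma_N, f) = -\ln \lambda + h_N(\rho_N, f) - \frac{1}{N} \ln \Bigl( 1 + e^{N h_N(\rho_N, f)} \sum_{n=1}^{N} \Bigl( \frac{1 - \lambda}{\lambda} \Bigr)^n \int_{\mathcal{S}} e^{-N f(\sigma)}\, \mathrm{tr}\bigl(A_{N,n} E_N(d\sigma)\bigr) \Bigr). \tag{54}$$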



Since $\rho_N$ and $\sigma_N$ are density matrices, the operators $A_{N,n}$ are positive. Hence,
the argument of the last logarithm in Equation (54) is greater than one and the
logarithm therefore positive. This implies:

$$h_N(\rho_N, f) \geq h_N(\lambda \rho_N + (1 - \lambda) \sigma_N, f) + \ln(\lambda), \tag{55}$$

and with Equation (52)

$$\liminf_{N\to\infty} h_N(\rho_N, f) \geq \liminf_{N\to\infty} h_N(\lambda \rho_N + (1 - \lambda) \sigma_N, f) + \ln(\lambda) = \lim_{N\to\infty} h_N(\rho, f) + \ln(\lambda). \tag{56}$$

Since $\lambda \in (0, 1)$ is arbitrary we get $\liminf_{N\to\infty} h_N(\rho_N, f) \geq \lim_{N\to\infty} h_N(\rho, f)$.
The other inequality (i.e. $\limsup_{N\to\infty} h_N(\rho_N, f) \leq \lim_{N\to\infty} h_N(\rho, f)$) can
be derived with the same argument, if we exchange the roles of $\rho$ and
$\rho_N$ (i.e. apply Lemma 4.2 to $\rho_N^{(1)} = \rho_N$ and $\rho_N^{(2)} = \rho$ for all $N \in \mathbb{N}$).
Hence $\lim_{N\to\infty} h_N(\rho_N, f) = \lim_{N\to\infty} h_N(\rho, f)$ as stated. The equality
$\lim_{N\to\infty} h_N(\rho, f) = h(\rho, f)$ follows from Varadhan's Theorem (Theorem A.3).

Now consider statement 2. If $\rho$ is degenerate, the method used above can
not be applied. However, if the rate function $I$ is sufficiently continuous, we can
extend (parts of) the result derived for non-degenerate density matrices to the
degenerate case. To this end we need the following lemma:

Lemma 4.3 Consider a compact metric space $(X, d)$ and a lower semicontinuous
function $F : X \times X \to [c, \infty]$, $c \in \mathbb{R}$. The infimum $F_-(x) = \inf_{y \in X} F(x, y)$
is lower semicontinuous as well.

Proof. Due to lower semicontinuity of $F$, we find for each $(x, y) \in X \times X$ and
each $\epsilon > 0$ a $\delta_{x,y} > 0$ with

$$d(x, x') < \delta_{x,y},\ d(y, y') < \delta_{x,y} \ \Rightarrow\ F(x', y') > F(x, y) - \epsilon. \tag{57}$$

Since $X$ is compact, each fixed $x \in X$ admits finitely many points $y_1, \ldots, y_k \in X$
such that the neighborhoods $U_j = \{ y' \in X \mid d(y', y_j) < \delta_{x,y_j} \}$ cover $X$.
Now define $\delta = \min_j \delta_{x,y_j} > 0$. For each $x'$ satisfying $d(x, x') < \delta$ and each
$y' \in X$ there is a $j \in \{1, \ldots, k\}$ with $F(x', y') > F(x, y_j) - \epsilon$. Hence
$F(x', y') > \inf_y F(x, y) - \epsilon$ and we get

$$d(x, x') < \delta \ \Rightarrow\ F_-(x') = \inf_{y'} F(x', y') \geq \inf_{y} F(x, y) - \epsilon. \tag{58}$$

Since $\delta > 0$ this shows that $F_-$ is lower semicontinuous at $x$ and since $x$ is
arbitrary the statement follows. □

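Let us apply this lemma to $F(\rho,\sigma) = I(\rho,\sigma) + f(\sigma)$. Since $I$ is lower
semicontinuous by assumption we get for each $\epsilon > 0$ a $\delta > 0$ such that $\|\rho' - \rho\|_1 < \delta$
implies $h(\rho', f) > h(\rho, f) - \epsilon$. Together with the convexity of the $\delta$-ball around
$\rho$ this implies

$$h(\lambda \rho + (1 - \lambda) \rho', f) > h(\rho, f) - \epsilon \quad \forall \lambda \in (0, 1). \tag{59}$$

If $(\rho_N)_{N\in\mathbb{N}}$ is a sequence in $\mathcal{S}$ converging to $\rho$, the convex linear combinations
$\lambda \rho_N + (1 - \lambda) \rho'$ converge to $\lambda \rho + (1 - \lambda) \rho'$. As in Equation (56) we get

$$\liminf_{N\to\infty} h_N(\lambda \rho_N + (1 - \lambda) \rho', f) \leq \liminf_{N\to\infty} h_N(\rho_N, f) - \ln(\lambda). \tag{60}$$

Now assume without loss of generality that $\rho'$ is non-degenerate. Then
$\lambda \rho + (1 - \lambda) \rho'$ is non-degenerate as well and we have according to item 1

$$\liminf_{N\to\infty} h_N(\lambda \rho_N + (1 - \lambda) \rho', f) = h(\lambda \rho + (1 - \lambda) \rho', f) > h(\rho, f) - \epsilon. \tag{61}$$

Hence

$$\liminf_{N\to\infty} h_N(\rho_N, f) \geq h(\rho, f) - \epsilon + \ln(\lambda). \tag{62}$$

Since $\epsilon > 0$ and $\lambda \in (0, 1)$ are arbitrary the statement follows. □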

According to Proposition 1.2.7 of [13] this lemma implies immediately that
the convergence $h_N \to h$ is uniform on each compact set of non-degenerate
density matrices.

Proposition 4.4 Consider the same assumptions as in the preceding lemma
and a compact set $K \subset \mathcal{S}$ consisting only of non-degenerate density matrices.
Then the convergence $h_N \to h$ is uniform on $K$, i.e.

$$ \lim_{N\to\infty} \sup_{\rho \in K} |h_N(\rho, f) - h(\rho, f)| = 0 \qquad (63) $$

holds.

Another simple consequence of Lemma 4.1 is the continuity of $h(\,\cdot\,, f)$ on the
interior of $\mathcal{S}$. The proof is again omitted, since it can be taken without change
from [13] (first paragraph of the proof of Proposition 1.2.7).

Proposition 4.5 Consider again the assumptions from Lemma 4.1. The function
$\mathcal{S} \ni \rho \mapsto h(\rho, f) \in \mathbb{R}$ is continuous at each non-degenerate $\rho$.

This is a somewhat surprising result, because it is derived without any further
assumption on the rate function $I$. Although it does not imply that $I(\rho, \sigma)$
is continuous in $\rho$, it shows at least that the dependence of $I$ on the original
density matrix $\rho$ is quite regular on the interior of the state space $\mathcal{S}$. On the
boundary, however, nothing can be said. The discussion in Sections 5.3 and
5.4 will indicate that this is probably a fundamental aspect of admissible rate
functions and not just a problem of the methods used in the proofs.

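Let us consider now the natural action of $\mathrm{U}(d)$ on the set $C(\mathcal{S})$ of continuous
functions on $\mathcal{S}$, i.e. for each $U \in \mathrm{U}(d)$ and each $f \in C(\mathcal{S})$ define $\alpha_U f \in C(\mathcal{S})$ by

$$ \alpha_U f(\sigma) = f(U\sigma U^*). \qquad (64) $$

Then we can consider for each fixed $\rho \in \mathcal{S}$ and each $f$ the functions

$$ \mathrm{U}(d) \ni U \mapsto h_N(U^*\rho U, \alpha_U f) \in \mathbb{R} \quad\text{and}\quad \mathrm{U}(d) \ni U \mapsto h(U^*\rho U, \alpha_U f) \in \mathbb{R}, \qquad (65) $$

and pose the same question as above, but now considering the dependency on
$U$ rather than on $\rho$. The following is the analog of Lemma 4.1.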

Lemma 4.6 Consider an estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfying the LDP with
rate function $I$, an arbitrary continuous (real valued) function $f$ and the
functionals $h_N$, $h$ defined in Equations (35) and (36).

1. For each non-degenerate density matrix $\rho \in \mathcal{S}$ and each sequence
$\mathbb{N} \ni N \mapsto U_N \in \mathrm{U}(d)$ converging to $U \in \mathrm{U}(d)$ we have

$$ \lim_{N\to\infty} h_N(U_N^*\rho U_N, \alpha_{U_N} f) = \lim_{N\to\infty} h_N(U^*\rho U, \alpha_U f) = h(U^*\rho U, \alpha_U f). \qquad (66) $$

2. If $I$ is lower semicontinuous in both arguments, the lower bound

$$ \liminf_{N\to\infty} h_N(U_N^*\rho U_N, \alpha_{U_N} f) \geq h(U^*\rho U, \alpha_U f) \qquad (67) $$

holds even for degenerate $\rho$.

Proof. To prove item 1 let us start with the observation that the function
sequence $(\alpha_{U_M} f)_{M\in\mathbb{N}}$ converges uniformly to $\alpha_U f$. Due to compactness of $\mathcal{S}$
the function $f$ is not just continuous but even uniformly continuous, i.e. for
each $\epsilon > 0$ there is a $\delta > 0$ with

$$ \|\sigma_1 - \sigma_2\|_1 < \delta \ \Rightarrow\ |f(\sigma_1) - f(\sigma_2)| < \epsilon. \qquad (68) $$

Convergence of $(U_M)_{M\in\mathbb{N}}$ implies the existence of $M_\epsilon \in \mathbb{N}$ with
$M > M_\epsilon \Rightarrow \|U_M - U\| < \delta/2$. For each $\sigma$ and each $M > M_\epsilon$ we therefore get

$$ \|U_M\sigma U_M^* - U\sigma U^*\|_1 \leq \|U_M\sigma U_M^* - U_M\sigma U^*\|_1 + \|U_M\sigma U^* - U\sigma U^*\|_1 \qquad (69) $$

$$ \leq \|U_M^* - U^*\|\,\|U_M\|\,\|\sigma\|_1 + \|U_M - U\|\,\|U^*\|\,\|\sigma\|_1 < \delta, \qquad (70) $$

which implies together with (68) for an arbitrary $\sigma$ and $M > M_\epsilon$

$$ |\alpha_{U_M} f(\sigma) - \alpha_U f(\sigma)| = |f(U_M\sigma U_M^*) - f(U\sigma U^*)| < \epsilon. \qquad (71) $$

In other words the convergence $\alpha_{U_M} f \to \alpha_U f$ is uniform as stated (since $M_\epsilon$
does not depend on $\sigma$).

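To proceed, it is necessary to consider the following simple properties of the
functionals $h_N$ and $h$: If $f$, $f_1$ denote continuous functions on $\mathcal{S}$ and $\epsilon \in \mathbb{R}$ we
have for all $\rho$

$$ f \geq f_1 \ \Rightarrow\ h_N(\rho, f) \geq h_N(\rho, f_1), \quad\text{and}\quad h_N(\rho, f + \epsilon) = h_N(\rho, f) + \epsilon, \qquad (72) $$

and from Lemma 4.1 we already know that for all $\epsilon > 0$ and all $f$ there is an
$N[\epsilon, f] \in \mathbb{N}$ with

$$ N > N[\epsilon, f] \ \Rightarrow\ |h_N(U_N^*\rho U_N, f) - h(U^*\rho U, f)| < \epsilon. \qquad (73) $$

Uniform convergence $\alpha_{U_M} f \to \alpha_U f$ implies that $\alpha_U f - \epsilon < \alpha_{U_M} f < \alpha_U f + \epsilon$
holds for all $M > M_\epsilon$. Hence for all $N \in \mathbb{N}$ we have

$$ h_N(U_N^*\rho U_N, \alpha_U f) - \epsilon \leq h_N(U_N^*\rho U_N, \alpha_{U_M} f) \leq h_N(U_N^*\rho U_N, \alpha_U f) + \epsilon \qquad (74) $$

according to (72). Together with (73) we get

$$ N > N[\epsilon, \alpha_U f],\ M > M_\epsilon \ \Rightarrow\ |h_N(U_N^*\rho U_N, \alpha_{U_M} f) - h(U^*\rho U, \alpha_U f)| < 2\epsilon, \qquad (75) $$

which implies Equation (66).

Statement 2 can be shown in the same way, if we replace Equation (73) by
(cf. Lemma 4.1)

$$ N > N[\epsilon, f] \ \Rightarrow\ h_N(U_N^*\rho U_N, f) > h(U^*\rho U, f) - \epsilon \qquad (76) $$

and use only the lower bound of (74). □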

As in the case of Lemma 4.1 we can now derive continuity and uniformity
properties from this result. The following proposition is (again) an immediate
consequence of [13, Prop. 1.2.7]. The proof is therefore omitted.

Proposition 4.7 Consider the same assumptions as in Lemma 4.6 and a
non-degenerate density matrix $\rho$.

1. The function

$$ \mathrm{U}(d) \ni U \mapsto h(U^*\rho U, \alpha_U f) = \inf_{\sigma} \bigl( I(U^*\rho U, U^*\sigma U) + f(\sigma) \bigr) \in \mathbb{R} \qquad (77) $$

is continuous.

2. The convergence of $h_N(U^*\rho U, \alpha_U f)$ to $h(U^*\rho U, \alpha_U f)$ is uniform in $U$, i.e.

$$ \lim_{N\to\infty} \sup_{U \in \mathrm{U}(d)} |h_N(U^*\rho U, \alpha_U f) - h(U^*\rho U, \alpha_U f)| = 0 \qquad (78) $$

holds.

4.2 Averaging 

Let us consider now the question whether covariance and permutation invariance
are "harmful" for the rate function; i.e. can we hope to exhaust the optimal
upper bounds from Equation (12) with schemes admitting these symmetry
properties? One possible way to answer this question is to start with a general scheme
$(E_N)_{N\in\mathbb{N}}$ and to average over the unitary and the permutation group. For the
latter this leads to

$$ E_N(\Delta) \mapsto \frac{1}{N!} \sum_{p \in \mathrm{S}_N} V_p E_N(\Delta) V_p^*, \qquad (79) $$

and since we have

$$ \mathrm{tr}(\rho^{\otimes N} V_p E_N(\Delta) V_p^*) = \mathrm{tr}(V_p^* \rho^{\otimes N} V_p E_N(\Delta)) = \mathrm{tr}(\rho^{\otimes N} E_N(\Delta)) \qquad (80) $$

for each permutation $p \in \mathrm{S}_N$, we see that the rate function is not changed at
all by this procedure. Hence, for the rest of this section we can assume without
loss of generality that each scheme is permutation invariant.
This leads us to averages over the unitary group, i.e.

$$ \bar{E}_N(\Delta) = \int_{\mathrm{U}(d)} U^{\otimes N} E_N(U^*\Delta U) U^{\otimes N *}\, dU. \qquad (81) $$

Here the situation is (unfortunately) different. The following proposition shows
that the convergence behavior of $\bar{E}_N$ is in general worse than that of $E_N$.

Proposition 4.8 Consider an estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfying the LDP
with rate function $I$ and the corresponding averaged scheme $(\bar{E}_N)_{N\in\mathbb{N}}$ from
Equation (81). For each non-degenerate density matrix $\rho$ the sequence of
probability measures $\mathrm{tr}(\rho^{\otimes N} \bar{E}_N(\,\cdot\,))$ satisfies the LDP with rate function $\bar{I}_\rho$ given by

$$ \bar{I}_\rho(\sigma) = \bar{I}(\rho, \sigma) = \inf_{U \in \mathrm{U}(d)} I(U^*\rho U, U^*\sigma U). \qquad (82) $$

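Proof. It is sufficient to show that the measures $\mathrm{tr}(\rho^{\otimes N} \bar{E}_N(\,\cdot\,))$ satisfy the
Laplace principle with the same rate function (cf. Theorem A.5), because the
Laplace principle is equivalent to the large deviation principle. Hence we have
to show that

$$ \lim_{N\to\infty} -\frac{1}{N} \ln \int_{\mathcal{S}} e^{-Nf(\sigma)}\, \mathrm{tr}(\rho^{\otimes N} \bar{E}_N(d\sigma)) = \inf_{\sigma} \bigl( f(\sigma) + \bar{I}(\rho, \sigma) \bigr) \qquad (83) $$

holds for all continuous functions $f$ on $\mathcal{S}$. Inserting the definition of $\bar{E}_N$ we get

$$ \int_{\mathcal{S}} e^{-Nf(\sigma)}\, \mathrm{tr}(\rho^{\otimes N} \bar{E}_N(d\sigma)) = \int_{\mathrm{U}(d)} \int_{\mathcal{S}} e^{-Nf(U\sigma U^*)}\, \mathrm{tr}\bigl((U^*\rho U)^{\otimes N} E_N(d\sigma)\bigr)\, dU, \qquad (84) $$

or with the notation from Subsection 4.1 (cf. Equations (35) and (64))

$$ \int_{\mathcal{S}} e^{-Nf(\sigma)}\, \mathrm{tr}(\rho^{\otimes N} \bar{E}_N(d\sigma)) = \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU. \qquad (85) $$

According to Proposition 4.7 the quantity $h_N(U^*\rho U, \alpha_U f)$ converges uniformly
in $U$ to $h(U^*\rho U, \alpha_U f)$, i.e. for each $\epsilon > 0$ there is an $N_\epsilon \in \mathbb{N}$ such that

$$ N > N_\epsilon \ \Rightarrow\ h(U^*\rho U, \alpha_U f) + \epsilon > h_N(U^*\rho U, \alpha_U f) > h(U^*\rho U, \alpha_U f) - \epsilon \quad \forall U \in \mathrm{U}(d) \qquad (86) $$

holds. Hence, for each $\epsilon > 0$ we get

$$ \limsup_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N(h(U^*\rho U, \alpha_U f) + \epsilon)}\, dU \geq \limsup_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU. \qquad (87) $$

From Proposition 4.7 we know that $h(U^*\rho U, \alpha_U f)$ is continuous in $U$ and we
can apply Varadhan's Theorem (Theorem A.3) to the left hand side of this
inequality. Together with

$$ \inf_{U} h(U^*\rho U, \alpha_U f) = \inf_{U} \inf_{\sigma} \bigl( I(U^*\rho U, \sigma) + f(U\sigma U^*) \bigr) \qquad (88) $$

$$ = \inf_{U} \inf_{\sigma} \bigl( I(U^*\rho U, U^*\sigma U) + f(\sigma) \bigr) \qquad (89) $$

$$ = \inf_{\sigma} \bigl( \bar{I}(\rho, \sigma) + f(\sigma) \bigr) \qquad (90) $$

this implies the upper bound

$$ \limsup_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \leq \inf_{\sigma} \bigl( \bar{I}(\rho, \sigma) + f(\sigma) \bigr) + \epsilon. \qquad (91) $$

The lower bound

$$ \liminf_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \geq \inf_{\sigma} \bigl( \bar{I}(\rho, \sigma) + f(\sigma) \bigr) - \epsilon \qquad (92) $$

can be shown in the same way. Since $\epsilon > 0$ was arbitrary, Equation (83) follows
from (91) and (92), which concludes the proof. □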
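
The step from (87) to (91) uses Varadhan's Theorem for the Haar measure, i.e. the statement that $-\frac{1}{N}\ln\int e^{-N g(U)}\,dU$ converges to $\inf_U g(U)$ for continuous $g$. The following is a minimal numerical sketch of this Laplace-type limit under simplifying assumptions: the unit interval with Lebesgue measure stands in for $\mathrm{U}(d)$ with Haar measure, and the function `g` (as well as the helper name `laplace_rate`) is purely hypothetical and not taken from the paper.

```python
import math

def laplace_rate(g, n, grid=20000):
    """Return -(1/n) * ln( integral_0^1 exp(-n * g(u)) du ).

    The integral is approximated by a midpoint rule and evaluated in
    log-space (log-sum-exp) to avoid underflow for large n.
    """
    values = [-n * g((i + 0.5) / grid) for i in range(grid)]
    m = max(values)
    log_integral = m + math.log(sum(math.exp(v - m) for v in values) / grid)
    return -log_integral / n

def g(u):
    # Hypothetical continuous stand-in for U -> h(U* rho U, alpha_U f);
    # its infimum over [0, 1] is 0.3, attained at u = 0.7.
    return 0.3 + (u - 0.7) ** 2

# The rates approach inf g = 0.3 from above as n grows.
rates = [laplace_rate(g, n) for n in (10, 100, 1000)]
```

The correction to the limit is of order $\ln(N)/N$, which is why sub-exponential prefactors never affect the rate functions discussed in this section.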

Hence, the best we can hope is that the averaged scheme satisfies the LDP
with rate function $\bar{I}$, which is actually the worst $\mathrm{U}(d)$-invariant rate function
which can be derived from $I$. Only if $I$ is $\mathrm{U}(d)$ invariant itself (such that
$\bar{I} = I$ holds), the convergence behavior of $(\bar{E}_N)_{N\in\mathbb{N}}$ is as good as that
of $(E_N)_{N\in\mathbb{N}}$. The following proposition shows that at least in this case the
convergence problems on the boundary of $\mathcal{S}$ can be solved.

Proposition 4.9 If $(E_N)_{N\in\mathbb{N}}$ is an estimation scheme satisfying the LDP with
a $\mathrm{U}(d)$-invariant, lower semicontinuous (in both arguments) rate function $I$, the
averaged scheme $(\bar{E}_N)_{N\in\mathbb{N}}$ defined in Equation (81) satisfies the LDP with the
same rate function.

Proof. We will show again the alternative statement that the sequence
$\mathrm{tr}(\rho^{\otimes N} \bar{E}_N(\,\cdot\,))$ satisfies the Laplace principle, i.e. Equation (83) holds for all
continuous real valued functions $f$ and with $\bar{I}$ replaced by $I$. As in the last
proof we can rewrite this in terms of the functionals $h_N$ and $h$ defined in
Equations (35) and (36), i.e. we have to show that (cf. Equation (85))

$$ \limsup_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \leq h(\rho, f) \qquad (93) $$

and

$$ \liminf_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \geq h(\rho, f) \qquad (94) $$

hold. But now the convergence of $h_N(U^*\rho U, \alpha_U f)$ to $h(\rho, f)$ is only known to
be pointwise (and not necessarily uniform) in $U$. Therefore, we cannot proceed
as in Proposition 4.8. Instead we will use different strategies for the upper and
the lower bound.

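To get the upper bound note that $f$ is (as a continuous function on a compact
set) bounded from above by a constant $K > 0$. Therefore the functions
$U \mapsto h_N(U^*\rho U, \alpha_U f)$ are bounded as well (by the same constant) and we get (note
that $h(U^*\rho U, \alpha_U f) = h(\rho, f)$ holds for all $U$ by assumption)

$$ \lim_{N\to\infty} \int_{\mathrm{U}(d)} |h_N(U^*\rho U, \alpha_U f) - h(\rho, f)|\, dU = 0 \qquad (95) $$

by the dominated convergence theorem. Now let us introduce for each $\epsilon > 0$ and
each $N \in \mathbb{N}$ the set

$$ A_{N,\epsilon} = \{ U \in \mathrm{U}(d) \mid |h_N(U^*\rho U, \alpha_U f) - h(\rho, f)| > \epsilon \}. \qquad (96) $$

From Equation (95) we see that for each $\delta > 0$ there is an $N_\delta \in \mathbb{N}$ such that
$N > N_\delta$ implies

$$ |A_{N,\epsilon}|\, \epsilon \leq \int_{\mathrm{U}(d)} |h_N(U^*\rho U, \alpha_U f) - h(\rho, f)|\, dU < \delta, \qquad (97) $$

where $|A_{N,\epsilon}|$ denotes the volume of $A_{N,\epsilon}$ with respect to the Haar measure
(note that $A_{N,\epsilon}$ is due to continuity of $U \mapsto h_N(U^*\rho U, \alpha_U f)$ open and therefore
measurable). Now choose $\epsilon > 0$ arbitrary and $\delta = \epsilon/2$; then we have for all
$N > N_\delta$

$$ \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \geq \int_{\mathrm{U}(d) \setminus A_{N,\epsilon}} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \geq \frac{1}{2}\, e^{-N(h(\rho, f) + \epsilon)}, \qquad (98) $$

where we have used the fact that $h_N(U^*\rho U, \alpha_U f) \leq h(\rho, f) + \epsilon$ holds for all
$U \notin A_{N,\epsilon}$, and that $|A_{N,\epsilon}| < 1/2$ by (97). Taking logarithms and the limit $N \to \infty$ this implies

$$ \limsup_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \leq h(\rho, f) + \epsilon. \qquad (99) $$

Since $\epsilon > 0$ was arbitrary we get the upper bound (93).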
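To prove the lower bound let us assume first that

$$ \liminf_{N\to\infty} \inf_{U \in \mathrm{U}(d)} \bigl( h_N(U^*\rho U, \alpha_U f) - h(\rho, f) \bigr) \geq 0 \qquad (100) $$

does not hold. Then we can find a sequence $(U_N)_{N\in\mathbb{N}}$ of unitaries with

$$ \liminf_{N\to\infty} \bigl( h_N(U_N^*\rho U_N, \alpha_{U_N} f) - h(\rho, f) \bigr) < 0. \qquad (101) $$

But due to compactness of $\mathrm{U}(d)$ we can assume without loss of generality that
$(U_N)_{N\in\mathbb{N}}$ converges to a unitary $U$. Hence Equation (101) contradicts Statement
2 of Lemma 4.6 (since the rate function $I$ is lower semicontinuous by
assumption). Hence Equation (100) is valid and we can find for each $\epsilon > 0$ an $N_\epsilon \in \mathbb{N}$
such that $N > N_\epsilon$ implies

$$ h_N(U^*\rho U, \alpha_U f) > h(\rho, f) - \epsilon \quad \forall U \in \mathrm{U}(d). \qquad (102) $$

Hence

$$ \liminf_{N\to\infty} -\frac{1}{N} \ln \int_{\mathrm{U}(d)} e^{-N h_N(U^*\rho U, \alpha_U f)}\, dU \geq h(\rho, f) - \epsilon. \qquad (103) $$

Since $\epsilon > 0$ is arbitrary we get the lower bound (94) and the proof is completed.
□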

This result is very useful if we want to check whether a given rate function
is admissible or not. Many prominent candidates are $\mathrm{U}(d)$-invariant and lower
semicontinuous (like relative entropy), and in this case it is according to
Proposition 4.9 sufficient to consider only covariant schemes. Important examples of
functions which can be tested this way are the optimal rate functions $\mathcal{I}_{\mathrm{Id}}$ and
$\mathcal{I}^0_{\mathrm{Id}}$ (for $\mathcal{I}_{\mathrm{Id}}$ this is true at least on the interior of $\mathcal{S}$):

Proposition 4.10 The optimal rate functions $\mathcal{I}_{\mathrm{Id}}$ and $\mathcal{I}^0_{\mathrm{Id}}$ are $\mathrm{U}(d)$ invariant
(i.e. Equation (11) holds with $\alpha_U(\sigma) = U\sigma U^*$).

Proof. Since $\mathcal{I}_{\mathrm{Id}}$ and $\mathcal{I}^0_{\mathrm{Id}}$ are defined as the upper bounds on $\mathcal{E}(\mathrm{Id})$ and $\mathcal{E}^0(\mathrm{Id})$
we have to show that these sets are invariant under the operation $I \mapsto I_U$
with $I_U(\rho, \sigma) = I(U\rho U^*, U\sigma U^*)$. Hence consider $I \in \mathcal{E}(\mathrm{Id})$. Then there is a
full estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfying the LDP with rate function $I$. For each
fixed $U \in \mathrm{U}(d)$ we can define the translated scheme $(E_N^U)_{N\in\mathbb{N}}$ with
$E_N^U(\Delta) = U^{\otimes N *} E_N(U\Delta U^*) U^{\otimes N}$. If $\Delta$ is closed we get

$$ \limsup_{N\to\infty} \frac{1}{N} \ln \mathrm{tr}(\rho^{\otimes N} E_N^U(\Delta)) = \limsup_{N\to\infty} \frac{1}{N} \ln \mathrm{tr}\bigl((U\rho U^*)^{\otimes N} E_N(U\Delta U^*)\bigr) \qquad (104) $$

$$ \leq - \inf_{\sigma \in U\Delta U^*} I(U\rho U^*, \sigma) \qquad (105) $$

$$ = - \inf_{\sigma \in \Delta} I(U\rho U^*, U\sigma U^*). \qquad (106) $$

This shows that the large deviation upper bound holds with rate function $I_U$.
The lower bound can be shown in the same way. Hence $(E_N^U)_{N\in\mathbb{N}}$ satisfies the
LDP with rate function $I_U$, and this implies $I_U \in \mathcal{E}(\mathrm{Id})$. Since the operation
$I \mapsto I_U$ respects semicontinuity of $I$, invariance of $\mathcal{E}^0(\mathrm{Id})$ is trivial and this
concludes the proof. □

Summarizing the discussion of this subsection we can conclude that averaging
is in the context of large deviations not as powerful as it is in other areas like
optimal cloning. Nevertheless, it is not completely useless either. In particular
the conjecture $\mathcal{I}^0_{\mathrm{Id}} \in \mathcal{E}(\mathrm{Id})$ is interesting in this regard, because it would imply
that $\mathcal{I}^0_{\mathrm{Id}}$ can be derived as the rate function of a covariant scheme. Hence,
covariant schemes are an important special case (and therefore worth studying),
although they probably cannot tell us the whole truth.



4.3 General structure 

Now let us have a look at the general structure of covariant and permutation
invariant estimation schemes. Our main tool is the following theorem about
covariant observables [24].

Theorem 4.11 Consider a compact group $G$ which acts transitively on a locally
compact, separable metric space $X$ by $G \times X \ni (g, x) \mapsto \alpha_g(x)$, and a
representation $\pi$ of $G$ on a Hilbert space $\mathcal{H}$. Each POV measure $E : \mathfrak{B}(X) \to \mathcal{B}(\mathcal{H})$
which is covariant (i.e. $E(\alpha_g \Delta) = \pi(g) E(\Delta) \pi(g)^*$ for all $\Delta \in \mathfrak{B}(X)$ and all
$g \in G$) has the form

$$ \int_X f(x)\, E(dx) = \int_G f(\alpha_g x_0)\, \pi(g) Q_0 \pi(g)^*\, \mu(dg) \qquad (107) $$

where $x_0 \in X$ is an (arbitrary) reference point, $\mu$ is the Haar measure on $G$ and
$Q_0 \in \mathcal{B}(\mathcal{H})$ a positive operator which is uniquely determined by (107) and the
choice of $x_0$.

Unfortunately this theorem is not applicable to our case, because the action
of $\mathrm{U}(d)$ on $\mathcal{S}$ is not transitive. A way out of this dilemma is to look at the
fibration $s : \mathcal{S} \to \Sigma$ introduced earlier and to apply the results about
transitive group actions to each fiber separately. (For the rest of this section we
will frequently use the notation introduced in Section 3.)

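Theorem 4.12 Each covariant and permutation invariant observable
$E : \mathfrak{B}(\mathcal{S}) \to \mathcal{B}(\mathcal{H}^{\otimes N})$ has the form (with a continuous function $f$ on $\mathcal{S}$)

$$ \int_{\mathcal{S}} f(\rho)\, E(d\rho) = \sum_{Y \in Y_d(N)} \left[ \int_{\mathrm{U}(d)} \pi_Y(U) \left( \int_{\Sigma} f(U\rho_x U^*)\, q_Y(dx) \right) \pi_Y(U^*)\, dU \right] \otimes \mathbb{1}_Y \qquad (108) $$

with a sequence of (non-normalized) POV measures $q_Y : \mathfrak{B}(\Sigma) \to \mathcal{B}(\mathcal{H}_Y)$, the
diagonal matrices $\rho_x = \mathrm{diag}(x_1, \ldots, x_d)$ from Equation (21) and the unit matrix
$\mathbb{1}_Y \in \mathcal{B}(\mathcal{K}_Y)$.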

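Proof. Permutation invariance implies immediately that

$$ E_N(\Delta) = \sum_{Y \in Y_d(N)} E_{N,Y}(\Delta) \otimes \mathbb{1}_Y \qquad (109) $$

holds with $\mathbb{1}_Y \in \mathcal{B}(\mathcal{K}_Y)$ and a family of POV measures $E_{N,Y} : \mathfrak{B}(\mathcal{S}) \to \mathcal{B}(\mathcal{H}_Y)$,
which are again $\mathrm{U}(d)$ covariant:

$$ E_{N,Y}(U\Delta U^*) = \pi_Y(U) E_{N,Y}(\Delta) \pi_Y(U)^* \quad \forall U \in \mathrm{U}(d). \qquad (110) $$

Hence we only have to look at $E_{N,Y}$ for a fixed $Y \in Y_d(N)$. Therefore the
statement is a consequence of the following lemma.

Lemma 4.13 Each $\mathrm{U}(d)$ covariant observable $E : \mathfrak{B}(\mathcal{S}) \to \mathcal{B}(\mathcal{H}_Y)$ has the form

$$ \int_{\mathcal{S}} f(\rho)\, E(d\rho) = \int_{\mathrm{U}(d)} \pi_Y(U) \left( \int_{\Sigma} f(U\rho_x U^*)\, q(dx) \right) \pi_Y(U^*)\, dU \qquad (111) $$

with an appropriate POV measure $q : \mathfrak{B}(\Sigma) \to \mathcal{B}(\mathcal{H}_Y)$.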

Proof. To each $\rho \in \mathcal{S}$ we can associate the stabilizer subgroup
$G_\rho = \{ U \in \mathrm{U}(d) \mid U\rho U^* = \rho \}$ of $\mathrm{U}(d)$, whose structure is uniquely determined by the
degeneracy of the eigenvalues of $\rho$. Hence the set

$$ J = \{ G_{\rho_x} \mid x \in \Sigma \} \quad\text{with}\quad \rho_x = \mathrm{diag}(x_1, \ldots, x_d) \qquad (112) $$

is finite and for each $\rho$ there is exactly one $G \in J$ such that $G_\rho = UGU^*$ holds
with an appropriate unitary $U \in \mathrm{U}(d)$. We can decompose $\mathcal{S}$ therefore into a
disjoint union $\mathcal{S} = \bigcup_{G \in J} \mathcal{S}_G$ of finitely many subsets^4

$$ \mathcal{S}_G = \{ \rho \in \mathcal{S} \mid \exists U \in \mathrm{U}(d) \text{ with } G_\rho = UGU^* \}; \qquad (113) $$

and similarly we have $\Sigma = \bigcup_{G} \Sigma_G$ with $\Sigma_G = s(\mathcal{S}_G)$. By construction each
orbit $s^{-1}(x)$, $x \in \Sigma_G$, is naturally homeomorphic to the homogeneous space
$X_G = \mathrm{U}(d)/G$. Hence, there is a natural homeomorphism $\Phi_G : \Sigma_G \times X_G \to \mathcal{S}_G$
which is uniquely determined by

$$ \Phi_G(x, [\mathbb{1}]) = \rho_x \quad\text{and}\quad \Phi_G(x, [V]) = V\rho_x V^* \quad \forall x \in \Sigma_G\ \forall [V] \in X_G. \qquad (114) $$

Note that the crucial property of $\Phi_G$ is to intertwine the group actions
$\rho \mapsto U\rho U^*$ and $[V] \mapsto [UV]$ of $\mathrm{U}(d)$ on $\mathcal{S}_G$ and $X_G$ respectively.

The $\mathcal{S}_G$ are in general neither open nor closed, but they are Borel subsets of
$\mathcal{S}$ (more precisely differentiable submanifolds with boundary): Since $s$ is
continuous, it is obviously sufficient to show that $\Sigma_G \in \mathfrak{B}(\Sigma)$ holds. But this follows
from the fact that each $\Sigma_G$ can be expressed as the complement of a Borel set
in a finite union of closed sets (this is easy to see but tedious to write down).
$\mathcal{S}_G \in \mathfrak{B}(\mathcal{S})$ now implies $\mathfrak{B}(\mathcal{S}_G) = \{ \Delta \cap \mathcal{S}_G \mid \Delta \in \mathfrak{B}(\mathcal{S}) \} \subset \mathfrak{B}(\mathcal{S})$ and we can
define the POV measures $E_G : \mathfrak{B}(\mathcal{S}_G) \to \mathcal{B}(\mathcal{H}_Y)$, $E_G(\Delta) = E(\Delta)$. Note that
the $E_G$ are not normalized and some of them can vanish completely. Since we
can reconstruct $E$ from the $E_G$ by $E(\Delta) = \sum_G E_G(\Delta \cap \mathcal{S}_G)$ it is sufficient to
prove the statement for each $G$ separately. In addition we can use the
homeomorphism $\Phi_G$ from Equation (114) to identify $\mathcal{S}_G$ with $\Sigma_G \times X_G$ and $E_G$ with
a POVM on $\mathfrak{B}(\Sigma_G \times X_G)$ which is covariant with respect to the group action

$$ \Sigma_G \times X_G \ni (x, [V]) \mapsto \alpha_U(x, [V]) = (x, [UV]) \in \Sigma_G \times X_G \qquad (115) $$

of $\mathrm{U}(d)$, i.e.

$$ E_G(\alpha_U \Delta) = \pi_Y(U) E_G(\Delta) \pi_Y(U^*) \quad \forall \Delta \in \mathfrak{B}(\Sigma_G \times X_G)\ \forall U \in \mathrm{U}(d). \qquad (116) $$

This is a direct consequence of the intertwining property of $\Phi_G$ mentioned above.

^The decomposition of S into a finite union of fiber bundles we are describing here is a
special case of a much more general result ("slice theorem") about compact G-manifolds; cf.
1^ .

Now let us consider the Abelian algebras $C(X_G)$ and $C(\Sigma_G)$ of continuous
functions on $X_G$ and $\Sigma_G$. Each $h \in C(\Sigma_G)$ defines a positive linear map by

$$ C(X_G) \ni k \mapsto E_{G,h}(k) = \int_{\Sigma_G \times X_G} h(x) k(y)\, E_G(dx \times dy) \in \mathcal{B}(\mathcal{H}_Y). \qquad (117) $$

Positivity and linearity of $E_{G,h}$ imply that it can be expressed as an integral
over $X_G$ with respect to a POV measure $E_{G,h}$

$$ E_{G,h}(k) = \int_{X_G} k(y)\, E_{G,h}(dy) \qquad (118) $$

(this is a general property of positive maps on Abelian algebras; cf. [SS]). From
(116) it follows immediately that $E_{G,h}$ is covariant and we can apply Theorem
4.11, i.e. there is a positive operator $Q_G(h)$ such that

$$ E_{G,h}(k) = \int_{\mathrm{U}(d)} k([U])\, \pi_Y(U) Q_G(h) \pi_Y(U^*)\, dU \qquad (119) $$

holds. Note that the distinguished point $x_0$ from Theorem 4.11 is in our case
$[\mathbb{1}] \in X_G$. Since $Q_G(h)$ is uniquely defined by this equation (cf. Theorem 4.11)
we get another positive linear map $Q_G : C(\Sigma_G) \ni h \mapsto Q_G(h) \in \mathcal{B}(\mathcal{H}_Y)$ which
can again be expressed as an integral

$$ Q_G(h) = \int_{\Sigma_G} h(x)\, q_G(dx), \qquad (120) $$

and we get

$$ \int_{\Sigma_G \times X_G} f(x, y)\, E_G(dx \times dy) = \int_{\mathrm{U}(d)} \pi_Y(U) \left( \int_{\Sigma_G} f(x, [U])\, q_G(dx) \right) \pi_Y(U^*)\, dU \qquad (121) $$

for each $f$ of the form $f(x, y) = h(x) k(y)$ with $h \in C(\Sigma_G)$, $k \in C(X_G)$, and by
linearity and continuity for each continuous $f$ on $\Sigma_G \times X_G$. Now we can again
apply the homeomorphism $\Phi_G$ to map $E_G$ back to a measure on $\mathcal{S}_G$. Since $\Phi_G$
intertwines the action of $\mathrm{U}(d)$ on $\mathcal{S}_G$ and $\Sigma_G \times X_G$ we get from (121)

$$ \int_{\mathcal{S}_G} f(\rho)\, E_G(d\rho) = \int_{\mathrm{U}(d)} \pi_Y(U) \left( \int_{\Sigma_G} f(U\rho_x U^*)\, q_G(dx) \right) \pi_Y(U^*)\, dU. \qquad (122) $$

Hence the statement of the lemma follows with $q(\Delta) = \sum_G q_G(\Delta \cap \Sigma_G)$. □
Together with the decomposition of $E$ from Equation (109) the statement
of this lemma concludes the proof of the theorem. □

4.4 An explicit scheme

The class of observables described in Theorem 4.12 is still quite big. To reduce
the freedom of choice further we can focus our attention on estimation schemes
which coincide with $(F_N)_{N\in\mathbb{N}}$ from Theorem 3.1 as long as only information
about the spectrum of $\rho$ is required. In other words $E_N$ should satisfy for all
$N \in \mathbb{N}$

$$ E_N(s^{-1}(\Delta)) = F_N(\Delta) \quad \forall \Delta \in \mathfrak{B}(\Sigma). \qquad (123) $$

This leads to the following corollary.

Corollary 4.14 Each covariant and permutation invariant estimation scheme
$(E_N)_{N\in\mathbb{N}}$ which satisfies Equation (123) can be written as

$$ \int_{\mathcal{S}} f(\rho)\, E_N(d\rho) = \sum_{Y \in Y_d(N)} \int_{\mathrm{U}(d)} f(U\rho_{Y/N} U^*)\, U^{\otimes N} (Q_Y \otimes \mathbb{1}) U^{\otimes N *}\, dU, \qquad (124) $$

with a family of operators $Q_Y \in \mathcal{B}(\mathcal{H}_Y)$.

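Proof. Equation (123) implies immediately that the POV measures $q_Y$ from
Theorem 4.12 are discrete, i.e.

$$ q_Y = \sum_{Z \in Y_d(N)} q_{YZ}\, \delta_{Z/N} \qquad (125) $$

where $\delta_{Z/N}$ denotes the Dirac measure at $Z/N \in \Sigma$ and $q_{YZ} \in \mathcal{B}(\mathcal{H}_Y)$. Hence
$E_N$ becomes

$$ \int_{\mathcal{S}} f(\rho)\, E_N(d\rho) = \sum_{Y \in Y_d(N)} \int_{\mathrm{U}(d)} f(U\rho_{Y/N} U^*)\, U^{\otimes N} \tilde{Q}_Y U^{\otimes N *}\, dU, \qquad (126) $$

with

$$ \tilde{Q}_Y = \sum_{Z \in Y_d(N)} q_{ZY} \otimes \mathbb{1}_Z. \qquad (127) $$

Using the definition of $F_N$ in Equation (17) and again Equation (123) we get

$$ P_Y = F_N(\{Y/N\}) = E_N(s^{-1}(Y/N)) = \int_{\mathrm{U}(d)} U^{\otimes N} \tilde{Q}_Y U^{\otimes N *}\, dU, \qquad (128) $$

but this implies that $\tilde{Q}_Y$ must be of the form $Q_Y \otimes \mathbb{1}$ with $Q_Y \in \mathcal{B}(\mathcal{H}_Y)$. Hence
(127) implies $q_{ZY} = 0$ for $Y \neq Z$, which proves the corollary. □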

Since the estimation scheme $(F_N)_{N\in\mathbb{N}}$ is asymptotically optimal, condition
(123) looks at first glance very natural. In contrast to permutation invariance
and covariance, however, we have no proof that it does not "harm" the rate
function. In other words the crucial question is: Given a covariant and
permutation invariant estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfying the LDP with rate function
$I$, does there exist a scheme $(\tilde{E}_N)_{N\in\mathbb{N}}$ which satisfies Equation (123) and the
LDP with a rate function $\tilde{I}$ such that $I \leq \tilde{I}$ holds? A possible strategy towards
a proof might be to define $\tilde{E}_N$ by Equation (124) with $Q_Y = \int_\Sigma q_Y(dx)$ and the
POV measures $q_Y$ which define $E_N$ according to Theorem 4.12. The hard part
(which we have not solved up to now) is of course to show that the rate function
$\tilde{I}$ of such a scheme is at least as good as $I$.

If we accept condition (123) nevertheless, the estimation scheme $(\hat{E}_N)_{N\in\mathbb{N}}$
arises from Corollary 4.14 if we choose

$$ Q_Y = \dim\mathcal{H}_Y\, |\phi_Y\rangle\langle\phi_Y|, \qquad (129) $$

where $\phi_Y$ denotes the highest weight vector of the irreducible representation
$\pi_Y$. To see (heuristically) why this should be a good choice for the $Q_Y$, consider
a nonsingular, diagonal density matrix $\rho = e^h$ with $h = \mathrm{diag}(h_1, \ldots, h_d)$ and
$h_1 > \cdots > h_d$. Since $\hat{E}_N$ projects to $F_N$ we know already that we get an exact
estimate for the spectrum of $\rho$ in the limit $N \to \infty$. To get a consistent scheme
we need operators $Q_Y$ such that the quantities

$$ \mathrm{tr}\bigl(\pi_Y(U^*\rho U) Q_Y\bigr) \dim\mathcal{K}_Y = \mathrm{tr}\bigl((U^*\rho U)^{\otimes N} (Q_Y \otimes \mathbb{1})\bigr) \qquad (130) $$

(regarded as densities along the orbits $\mathcal{S}_Y = s^{-1}(Y/N)$) are more and more
concentrated on the density operators with the correct eigenvectors, i.e. on $\rho_{Y/N}$.
Since $Y \in Y_d(N)$ is the highest weight of the irreducible representation $\pi_Y$
and $\phi_Y$ its highest weight vector, the highest eigenvalue of $\pi_Y(\rho)$ is given by
$\exp(\sum_j Y_j h_j)$ and $\phi_Y \in \mathcal{H}_Y$ is the corresponding eigenvector. All other
eigenvalues grow with a lower exponential rate (or decay faster, depending on the
chosen normalization). The matrix element $\langle\phi_Y, \pi_Y(\rho)\phi_Y\rangle$ dominates therefore
all other eigenvalues in the limit $N \to \infty$. Hence the density (130) has the
desired behavior if we choose $Q_Y = |\phi_Y\rangle\langle\phi_Y|$. Note that the reasoning just
sketched indicates that for any consistent scheme of the form (124) the overlap
of the $Q_Y$ with $|\phi_Y\rangle\langle\phi_Y|$ should not decay too fast (at most polynomially). In the
case of pure input states we will make this reasoning more precise in a later section.

4.5 Proof of Theorem 3.2

Our next task is to prove Theorem 3.2, i.e. we have to show that the estimation
scheme $\hat{E}_N$ defined in Equation (20) satisfies the LDP with rate function $\hat{I}$ given
in (26). The first step is to check that $\hat{I}$ is well defined.

Lemma 4.15 There is a (unique) function $\hat{I}$ on $\mathcal{S} \times \mathcal{S}$ which satisfies
$\hat{I}(\rho, U\rho_x U^*) = \sum_{j=1}^{d} x_j \ln(x_j) - I_1(\rho, U, x)$ and

$$ I_1(\rho, U, x) = \sum_{j=1}^{d} (x_j - x_{j+1}) \ln\bigl[\mathrm{pm}_j(U^*\rho U)\bigr] \qquad (131) $$

where we have set $x_{d+1} = 0$. $\hat{I}$ is positive and $\hat{I}(\rho, \sigma) = 0$ implies $\sigma = \rho$.

Proof. To prove that $\hat{I}$ is well defined we have to show that $U_1\rho_x U_1^* =
U_2\rho_x U_2^*$ implies $I_1(\rho, U_1, x) = I_1(\rho, U_2, x)$. This is equivalent to
$[U, \rho_x] = 0 \Rightarrow I_1(\rho, U, x) = I_1(\rho, \mathbb{1}, x)$. To exploit the relation $[U, \rho_x] = 0$ let us introduce
$k \leq d$ integers $1 = j_0 < j_1 < \cdots < j_k = d + 1$ such that $x_{j_\alpha} > x_{j_{\alpha+1}}$ and
$x_j = x_{j_\alpha}$ holds for $j_\alpha \leq j < j_{\alpha+1}$ and $\alpha < k$. Then we have

$$ I_1(\rho, U, x) = \sum_{\alpha=1}^{k} (x_{j_\alpha - 1} - x_{j_\alpha}) \ln\bigl[\mathrm{pm}_{j_\alpha - 1}(U^*\rho U)\bigr]. \qquad (132) $$

On the other hand $[U, \rho_x] = 0$ implies that $U$ is block diagonal

$$ U = \mathrm{diag}(U_0, \ldots, U_{k-1}) \quad\text{with}\quad U_\alpha \in \mathrm{U}(d_\alpha),\ d_\alpha = j_{\alpha+1} - j_\alpha. \qquad (133) $$

Hence we have $\mathrm{pm}_{j_\alpha - 1}(U^*\rho U) = \mathrm{pm}_{j_\alpha - 1}(\rho)$ for all such $U$ and all $\alpha$ with
$1 \leq \alpha \leq k$. Together with Equation (132) this shows that $\hat{I}$ is well defined.

To prove positivity we have to show that $\inf_U \hat{I}(\rho, U\rho_x U^*) \geq 0$ holds for each
$\rho$ and $x$. Hence we have to minimize $\hat{I}$ (for fixed $x$ and $\rho$) and since $x_j \geq x_{j+1}$
this implies that we have to maximize the minors of $U^*\rho U$. To this end let
us denote the eigenvalues of $\rho$ and the upper left $j \times j$ submatrix of $U^*\rho U$
by $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_d$ respectively $\lambda_1^{(j)} \geq \lambda_2^{(j)} \geq \cdots \geq \lambda_j^{(j)}$. The minors of
$U^*\rho U$ then become $\mathrm{pm}_j(U^*\rho U) = \lambda_1^{(j)} \cdots \lambda_j^{(j)}$. According to [23, Thm. 4.3.15]
the $\lambda_k^{(j)}$ satisfy the constraint $\lambda_k \geq \lambda_k^{(j)}$ for all $k = 1, \ldots, j$, and this bound is
(obviously) saturated if $U^*\rho U$ is diagonal in the preferred basis. Hence we get
$\mathrm{pm}_j(U^*\rho U) \leq \lambda_1 \cdots \lambda_j$ and therefore

$$ \hat{I}(\rho, U\rho_x U^*) \geq \sum_{j=1}^{d} x_j \ln(x_j) - \sum_{j=1}^{d} (x_j - x_{j+1}) \ln(\lambda_1 \cdots \lambda_j). \qquad (134) $$

Expanding the logarithms and reshuffling the second sum leads to

$$ \hat{I}(\rho, U\rho_x U^*) \geq \sum_{j=1}^{d} x_j \bigl(\ln(x_j) - \ln(\lambda_j)\bigr), \qquad (135) $$

and equality holds iff $\rho$ and $\sigma = U\rho_x U^*$ are simultaneously diagonalizable. Since
the right hand side of this inequality is a relative entropy of classical probability
distributions, we see that $\hat{I}$ is positive and $\hat{I}(\rho, \sigma) = 0$ holds iff $\sigma = \rho$. □
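
The reshuffling that turns the lower bound (134) into the classical relative entropy (135) is a summation by parts. A minimal numerical sketch of this step follows; the function names and the two spectra `lam` and `x` are hypothetical values chosen for illustration, and both lists are assumed to be ordered decreasingly.

```python
import math

def minors_form(x, lam):
    """Right-hand side of (134):
    sum_j x_j ln x_j - sum_j (x_j - x_{j+1}) ln(lam_1 ... lam_j)."""
    d = len(x)
    x_ext = list(x) + [0.0]            # convention x_{d+1} = 0
    entropy_part = sum(xj * math.log(xj) for xj in x if xj > 0)
    log_prod = 0.0                     # running value of ln(lam_1 ... lam_j)
    minor_part = 0.0
    for j in range(d):
        log_prod += math.log(lam[j])
        minor_part += (x_ext[j] - x_ext[j + 1]) * log_prod
    return entropy_part - minor_part

def relative_entropy_form(x, lam):
    """Right-hand side of (135): sum_j x_j (ln x_j - ln lam_j)."""
    return sum(xj * (math.log(xj) - math.log(lj))
               for xj, lj in zip(x, lam) if xj > 0)

lam = [0.5, 0.3, 0.2]   # hypothetical ordered spectrum of rho
x = [0.6, 0.3, 0.1]     # hypothetical ordered spectrum of the estimate

bound_134 = minors_form(x, lam)
bound_135 = relative_entropy_form(x, lam)
```

Both expressions agree up to rounding, the common value is strictly positive for `x != lam`, and it vanishes when the two spectra coincide, in line with the positivity statement of Lemma 4.15.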

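Now let us show that $(\hat{E}_N)_{N\in\mathbb{N}}$ satisfies the LDP with rate function $\hat{I}$. As in
the proof of Proposition 4.9 we will do this by proving the equivalent statement
that $(\hat{E}_N)_{N\in\mathbb{N}}$ satisfies the Laplace principle with the same rate function, i.e.

$$ \lim_{N\to\infty} -\frac{1}{N} \ln \int_{\mathcal{S}} e^{-Nf(\sigma)}\, \mathrm{tr}(\rho^{\otimes N} \hat{E}_N(d\sigma)) = \inf_{\sigma} \bigl( f(\sigma) + \hat{I}(\rho, \sigma) \bigr) \qquad (136) $$

should hold for all continuous functions $f$ on $\mathcal{S}$. If we insert the definition of
$\hat{E}_N$, the integral on the left hand side becomes

$$ \int_{\mathcal{S}} e^{-Nf(\sigma)}\, \mathrm{tr}(\rho^{\otimes N} \hat{E}_N(d\sigma)) = \sum_{Y \in Y_d(N)} \dim\mathcal{H}_Y \int_{\mathrm{U}(d)} e^{-Nf(U\rho_{Y/N} U^*)}\, \mathrm{tr}\bigl((U^*\rho U)^{\otimes N} (|\phi_Y\rangle\langle\phi_Y| \otimes \mathbb{1}_Y)\bigr)\, dU, \qquad (137) $$

where $\mathbb{1}_Y$ denotes the unit operator on $\mathcal{K}_Y$. Now assume that $\rho$ is non-degenerate
(i.e. $\rho \in \mathrm{GL}(d, \mathbb{C})$); then we can rewrite the density in this integral as

$$ \mathrm{tr}\bigl((U^*\rho U)^{\otimes N} (|\phi_Y\rangle\langle\phi_Y| \otimes \mathbb{1}_Y)\bigr) = \mathrm{tr}\bigl(P_Y (U^*\rho U)^{\otimes N} P_Y (|\phi_Y\rangle\langle\phi_Y| \otimes \mathbb{1}_Y)\bigr) \qquad (138) $$

$$ = \dim\mathcal{K}_Y\, \mathrm{tr}\bigl(\pi_Y(U^*\rho U) |\phi_Y\rangle\langle\phi_Y|\bigr) \qquad (139) $$

$$ = \dim\mathcal{K}_Y\, \langle\phi_Y, \pi_Y(U^*\rho U) \phi_Y\rangle \qquad (140) $$

where we have used in the second equation that $P_Y (U^*\rho U)^{\otimes N} P_Y = \pi_Y(U^*\rho U) \otimes \mathbb{1}_Y$
holds. The matrix elements of $\pi_Y(U^*\rho U)$ with respect to the highest weight
vector can be expressed as (I5S1 § 49] or Sect. IX. 8])

$$ \langle\phi_Y, \pi_Y(U^*\rho U) \phi_Y\rangle = \prod_{k=1}^{d} \mathrm{pm}_k(U^*\rho U)^{Y_k - Y_{k+1}} \qquad (141) $$

where we have set $Y_{d+1} = 0$. The right hand side of this equation makes sense
even if the exponents are not integer valued. We can rewrite therefore Equation
(137) with the probability measure

$$ \int_{\Sigma} h(x)\, \nu_N(dx) = \frac{1}{d^N} \sum_{Y \in Y_d(N)} h(Y/N) \dim(\mathcal{H}_Y) \dim(\mathcal{K}_Y) \qquad (142) $$

to get

$$ \int_{\mathcal{S}} e^{-Nf(\sigma)}\, \mathrm{tr}(\rho^{\otimes N} \hat{E}_N(d\sigma)) \qquad (143) $$

$$ = \int_{\Sigma} \int_{\mathrm{U}(d)} d^N e^{-Nf(U\rho_x U^*)} \prod_{k=1}^{d} \mathrm{pm}_k(U^*\rho U)^{N(x_k - x_{k+1})}\, dU\, \nu_N(dx) \qquad (144) $$

$$ = \int_{\Sigma} \int_{\mathrm{U}(d)} \exp\bigl(-N[f(U\rho_x U^*) - \ln(d) - I_1(\rho, U, x)]\bigr)\, dU\, \nu_N(dx) \qquad (145) $$

where

$$ I_1(\rho, U, x) = \sum_{k=1}^{d} (x_k - x_{k+1}) \ln\bigl[\mathrm{pm}_k(U^*\rho U)\bigr] \qquad (146) $$

is the function from Equation (131). Now we need the following lemma.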



Lemma 4.16 The probability measures $\nu_N$ defined in Equation (142) satisfy
the large deviation principle with rate function

$$ I_0(x) = \ln(d) + \sum_{j=1}^{d} x_j \ln(x_j). \qquad (147) $$

Proof. This follows immediately from Theorem 3.1 with $\rho = \mathbb{1}/d$ (cf. [21]).
□
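
The matrix-element formula (141), on which the rewriting (143)-(145) rests, can be checked by hand in the smallest non-trivial case. The sketch below assumes $d = 2$, $N = 3$ and the Young frame $Y = (2, 1)$, and it uses the vector $(e_1 \otimes e_2 - e_2 \otimes e_1)/\sqrt{2} \otimes e_1$ as a concrete highest weight vector of weight $Y$ inside $(\mathbb{C}^2)^{\otimes 3}$; this embedding, the helper names and the matrix `rho` are illustrative assumptions, not taken from the paper.

```python
import math

def tensor_power_element(a, phi, n):
    """<phi, a^{(x) n} phi> for a real matrix `a` and a real vector `phi`
    stored as a dict mapping index n-tuples to coefficients."""
    total = 0.0
    for i, ci in phi.items():
        for j, cj in phi.items():
            term = ci * cj
            for k in range(n):
                term *= a[i[k]][j[k]]
            total += term
    return total

def leading_minor(a, k):
    """Determinant of the upper-left k x k block (only k = 1, 2 needed)."""
    if k == 1:
        return a[0][0]
    return a[0][0] * a[1][1] - a[0][1] * a[1][0]

# Hypothetical full-rank qubit density matrix (real, symmetric, trace one).
rho = [[0.7, 0.2], [0.2, 0.3]]

s = 1.0 / math.sqrt(2.0)
# Antisymmetrize the first two tensor factors, put e_1 in the third one.
phi = {(0, 1, 0): s, (1, 0, 0): -s}
Y = (2, 1, 0)                      # Y_3 = 0 by convention

lhs = tensor_power_element(rho, phi, 3)
rhs = math.prod(leading_minor(rho, k) ** (Y[k - 1] - Y[k]) for k in (1, 2))
```

Here `lhs` evaluates the matrix element of $\rho^{\otimes 3}$ directly, while `rhs` is the product of principal minors $\mathrm{pm}_1(\rho)^{Y_1 - Y_2}\, \mathrm{pm}_2(\rho)^{Y_2}$; for the chosen matrix both equal $0.7 \cdot 0.17 = 0.119$.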

Obviously the product measure $\nu_N(dx) \times dU$ satisfies the LDP with the
same rate function. Moreover, the function in the argument of the exponential
in Equation (145) is continuous in $x$ and $U$. Hence we can apply Varadhan's
theorem to Equation (145) and get

$$ \lim_{N\to\infty} -\frac{1}{N} \ln \int_{\mathcal{S}} e^{-Nf(\sigma)}\, \mathrm{tr}(\rho^{\otimes N} \hat{E}_N(d\sigma)) \qquad (148) $$

$$ = \inf_{x, U} \bigl( f(U\rho_x U^*) - \ln(d) - I_1(\rho, U, x) + I_0(x) \bigr) \qquad (149) $$

$$ = \inf_{x, U} \Bigl( f(U\rho_x U^*) + \sum_{j=1}^{d} x_j \ln(x_j) - I_1(\rho, U, x) \Bigr), \qquad (150) $$

which proves Theorem 3.2 for non-degenerate density matrices.

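Now assume that $\rho$ is degenerate and has rank $r < d$. By continuity in $\rho$,
Equations (140) and (141) imply that

$$ \mathrm{tr}\bigl((U^*\rho U)^{\otimes N} (|\phi_Y\rangle\langle\phi_Y| \otimes \mathbb{1}_Y)\bigr) = \dim\mathcal{K}_Y \prod_{k=1}^{d} \mathrm{pm}_k(U^*\rho U)^{Y_k - Y_{k+1}} \qquad (151) $$

holds as in the non-degenerate case. The only difference is that the right hand
side can vanish now, and it vanishes in particular for all $Y$ with $Y_k > 0$ for some $k > r$
(because all minors with $k > r$ vanish for any $U$). Instead of (144) we therefore
get

$$ \int_{\mathcal{S}} e^{-Nf(\sigma)}\, \mathrm{tr}(\rho^{\otimes N} \hat{E}_N(d\sigma)) = \int_{\Sigma_r} \int_{\mathrm{U}(d)} r^N e^{-Nf(U\rho_x U^*)} \prod_{k=1}^{r} \mathrm{pm}_k(U^*\rho U)^{N(x_k - x_{k+1})}\, dU\, \nu_{N,r}(dx) \qquad (152) $$

with

$$ \Sigma_r = \{ x \in \Sigma \mid x_k = 0\ \forall k > r \} \qquad (153) $$

and

$$ \int_{\Sigma_r} h(x)\, \nu_{N,r}(dx) = \frac{1}{r^N} \sum_{Y \in Y_r(N)} h(Y/N) \dim(\mathcal{H}_Y) \dim(\mathcal{K}_Y). \qquad (154) $$

Note that the difference between $\nu_N$ and $\nu_{N,r}$ is just the summation over all
Young frames with $r$ rows instead of $d$ rows. The right hand side of Equation
(151) can still vanish because the unitary matrix $U$ is a $d \times d$ matrix. Hence we
can exclude

$$ M = \{ U \in \mathrm{U}(d) \mid \mathrm{pm}_r(U^*\rho U) = 0 \} \qquad (155) $$

from the domain of integration without changing the value of the integral in
(152). Hence we get

$$ \int_{\mathcal{S}} e^{-Nf(\sigma)}\, \mathrm{tr}(\rho^{\otimes N} \hat{E}_N(d\sigma)) = \int_{\Sigma_r} \int_{\mathrm{U}(d) \setminus M} \exp\bigl(-N[f(U\rho_x U^*) - \ln(r) - I_1(\rho, U, x)]\bigr)\, dU\, \nu_{N,r}(dx). \qquad (156) $$

The domain $\Sigma_r \times (\mathrm{U}(d) \setminus M)$ is open in $\Sigma_r \times \mathrm{U}(d)$ and $I_1$ is continuous on it.
Hence we can apply Varadhan's Theorem and proceed as in the non-degenerate
case.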



5 Upper bounds 

In this section we will provide a detailed discussion of general upper bounds on
admissible rate functions. This includes in particular the proofs of Theorems 3.1
and 3.3.

5.1 Hypothesis testing

Let us start with a very brief review of some material from quantum hypothesis 
testing (for a detailed discussion cf. ^| |^ because it can be used to 

derive related results for estimation schemes. As in state estimation the task of 
hypothesis testing is to determine a state from measurements on N systems. 
In hypothesis testing, however, we know a priori that only a finite number of 
different states can occur. For our purposes it is sufficient to distinguish only 
between two states po,pi G S. This can be done by an observable of the N- 
fold system with values in the set {0, 1}, where we conclude from the outcome 
j G {0, 1} that the initial preparation was done according to pj. Mathematically 
such an observable is given by a positive operator An G B{H^^) with Ajm < I 
and ti^p'^^ An) is the probability to get the result during a measurement on 
N systems in the joint state pf^- Hence the two quantities 

aAr(^jv)=tr(p®^(I-^jv)), f3N{AN) = tT{pf^AN) (157) 

are error probabilities. More precisely aN{A]\[) is the probability to detect pi 
although the initial preparation was given by pf^ (error of the first kind) and 
Pn{An) is the probability for the converse situation (error of the second kind). 
Ideally we would like to have a test which minimizes and (3^- This is how- 
ever impossible because we can always reduce one quantity at the expense of 
the other. A possible solution of this problem is to make Pn{An) as small as 
possible under the constraint that aNiAjsi) remains bounded by some e > 0. 
The corresponding minimal (second kind) error probability is therefore 

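$$\beta_N(\epsilon) = \inf\{\beta_N(A_N) \mid A_N \in \mathcal{B}(\mathcal{H}^{\otimes N}),\ 0 \le A_N \le \mathbb{1},\ \alpha_N(A_N) \le \epsilon\}. \tag{158}$$

Stein's Lemma describes the behavior of $\beta_N(\epsilon)$ in the limit $N \to \infty$; the quantum
version is shown in [22].

Theorem 5.1 (Quantum Stein's Lemma) For any $0 < \epsilon < 1$ the equality

$$\lim_{N\to\infty} \frac{1}{N} \ln \beta_N(\epsilon) = -S(\rho_1, \rho_0) \tag{159}$$

holds.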
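As an informal sanity check of Theorem 5.1, the following sketch looks at the commuting (classical) special case only: $\rho_j = \mathrm{diag}(1 - p_j, p_j)$, so that a measurement on $N$ systems reduces to counting ones among $N$ coin flips. The function names, the parameter values and the restriction to a deterministic threshold test are assumptions made for this illustration and do not come from the paper; the threshold test only gives an upper bound on $\beta_N(\epsilon)$, but its decay rate should still approach $S(\rho_1, \rho_0) = \mathrm{tr}\,\rho_0(\ln\rho_0 - \ln\rho_1)$, where the argument order is the one suggested by Equation (166).

```python
import numpy as np

def log_binom_pmf(n, p):
    """Natural log of the Binomial(n, p) probabilities for k = 0, ..., n."""
    log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, n + 1)))))
    k = np.arange(n + 1)
    log_choose = log_fact[n] - log_fact[k] - log_fact[n - k]
    return log_choose + k * np.log(p) + (n - k) * np.log1p(-p)

def stein_exponent(p0, p1):
    """S(rho_1, rho_0) = tr rho_0 (ln rho_0 - ln rho_1) for rho_j = diag(1 - p_j, p_j)."""
    return p0 * np.log(p0 / p1) + (1 - p0) * np.log((1 - p0) / (1 - p1))

def second_kind_rate(n, p0, p1, eps):
    """-(1/n) ln beta_n for the test 'decide rho_0 iff #ones <= c' (assumes p0 < p1).

    c is the smallest threshold whose first kind error is at most eps.
    """
    cdf0 = np.cumsum(np.exp(log_binom_pmf(n, p0)))
    c = int(np.searchsorted(cdf0, 1.0 - eps))
    log_terms = log_binom_pmf(n, p1)[: c + 1]
    shift = log_terms.max()
    log_beta = shift + np.log(np.exp(log_terms - shift).sum())  # stable log-sum-exp
    return -log_beta / n

if __name__ == "__main__":
    p0, p1, eps = 0.3, 0.6, 0.1  # hypothetical example values
    for n in (100, 1000, 4000):
        print(n, second_kind_rate(n, p0, p1, eps), stein_exponent(p0, p1))
```

The printed rates should increase slowly towards the exponent as $N$ grows; the convergence is slow because the threshold sits a multiple of $\sqrt{N}$ away from the mean.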

5.2 State estimation 

Let us consider now a (full) estimation scheme $(E_N)_{N\in\mathbb{N}}$. One possibility to
distinguish between two states $\rho$ and $\sigma$ is to choose a neighborhood $\Delta \in \mathfrak{B}(\mathcal{S})$
of $\sigma$ with $\rho \notin \Delta$ and to use the tests $A_N = E_N(\Delta)$. If $(E_N)_{N\in\mathbb{N}}$ is consistent, the
corresponding first kind error probability $\alpha_N(A_N)$ vanishes in the limit $N \to \infty$
and we can apply Stein's Lemma to get a bound on $\beta_N(A_N) = \mathrm{tr}(\rho^{\otimes N} E_N(\Delta))$.
Exploiting this idea more carefully leads to the following theorem.

Theorem 5.2 Consider a continuous map $p : \mathcal{S} \to X$ onto a locally compact,
separable metric space $X$. The optimal rate function $\mathcal{I}_p$ defined in Equation [...]
satisfies the inequality

$$\mathcal{I}_p(\rho, x) \le \inf_{p(\sigma) = x} S(\rho, \sigma) \quad \forall \rho \in \mathcal{S}\ \forall x \in X, \tag{160}$$
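where $S$ denotes the quantum relative entropy.

Proof. For each pair $\rho_0, \rho_1$ of density operators with $p(\rho_0) \ne p(\rho_1)$ we can
find a sequence of tests $(A_N)_{N\in\mathbb{N}}$ by $A_N = E_N(\Delta)$ with an appropriate Borel
set $\Delta \subset X$. If $\Delta$ is a neighborhood of $p(\rho_0)$, consistency of $(E_N)_{N\in\mathbb{N}}$
implies that for all $\epsilon > 0$ there is an $N_\epsilon \in \mathbb{N}$ such that

$$\alpha_N(A_N) = 1 - \mathrm{tr}\bigl(E_N(\Delta)\rho_0^{\otimes N}\bigr) < \epsilon \tag{161}$$

holds for all $N > N_\epsilon$. Hence Stein's Lemma implies

$$\limsup_{N\to\infty} -\frac{1}{N}\ln\beta_N(A_N) = \limsup_{N\to\infty} -\frac{1}{N}\ln\mathrm{tr}\bigl(\rho_1^{\otimes N}E_N(\Delta)\bigr) \le S(\rho_1, \rho_0). \tag{162}$$

Now assume that the rate function $I$ satisfies $I(\rho_1, x_0) > S(\rho_1, \rho_0)$ for some
$\rho_0, \rho_1$ with $p(\rho_0) = x_0$ and $p(\rho_1) \ne x_0$. Since $I(\rho_1, \cdot)$ is lower semi-continuous
we find a closed neighborhood $\Delta$ of $x_0$ such that

$$I(\rho_1, x) > S(\rho_1, \rho_0) + \delta \quad \forall x \in \Delta \tag{163}$$

holds for an appropriate $\delta > 0$. Hence the large deviation upper bound (215)
implies

$$\limsup_{N\to\infty} \frac{1}{N}\ln\mathrm{tr}\bigl(\rho_1^{\otimes N}E_N(\Delta)\bigr) \le -\inf_{x\in\Delta} I(\rho_1, x) \tag{164}$$

and therefore

$$\liminf_{N\to\infty} -\frac{1}{N}\ln\mathrm{tr}\bigl(\rho_1^{\otimes N}E_N(\Delta)\bigr) \ge \inf_{x\in\Delta} I(\rho_1, x) \ge S(\rho_1, \rho_0) + \delta, \tag{165}$$

in contradiction to Equation (162). Hence $I(\rho_1, x_0) \le S(\rho_1, \rho_0)$ for all $\rho_0$ with
$p(\rho_0) = x_0$, which concludes the proof. □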

Proof of Theorem 3.3. If we apply Theorem 5.2 to full estimation schemes (i.e.
$X = \mathcal{S}$ and $p = \mathrm{Id}$) we get $I(\rho, \sigma) \le S(\rho, \sigma)$ $\forall \rho, \sigma \in \mathcal{S}$ and Theorem 3.3 follows
as a simple corollary. □

Proof of Theorem 3.1. For a spectral estimation scheme with rate function $I$,
Theorem 5.2 implies that $I(\rho, x) \le \inf_{s(\sigma) = x} S(\rho, \sigma)$ holds. But the infimum on
the right hand side is achieved if $\sigma$ and $\rho$ commute and the eigenvalues in a
joint eigenbasis are given in the same order. In this case we have

$$S(\rho, \sigma) = \sum_{j=1}^{d} x_j (\ln x_j - \ln r_j) = S(r, x) \tag{166}$$

where $s(\sigma) = x = (x_1, \ldots, x_d)$ and $s(\rho) = r = (r_1, \ldots, r_d)$ denote the ordered
spectra of $\sigma$ and $\rho$ and $S(r, x)$ is the classical relative entropy of the probability
vectors $r$ and $x$. Hence for spectral estimation the upper bound (160) becomes

$$I(\rho, x) \le S(s(\rho), x) \quad \forall \rho \in \mathcal{S}\ \forall x \in \Sigma. \tag{167}$$

But from [...] we know already that the scheme $(\hat F_N)_{N\in\mathbb{N}}$ defined in (17) saturates
this bound; hence $(\hat F_N)_{N\in\mathbb{N}}$ is asymptotically optimal as stated in Theorem
3.1. □
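The claim that the infimum of $S(\rho, \sigma)$ over all $\sigma$ with a given spectrum is attained at a commuting $\sigma$ with equally ordered eigenvalues can be probed numerically. The sketch below is only an illustration under stated assumptions: it takes $S(\rho, \sigma) = \mathrm{tr}\,\sigma(\ln\sigma - \ln\rho)$ (the argument order read off from Equation (166)), uses full-rank example spectra chosen here, and all function names are hypothetical.

```python
import numpy as np

def relative_entropy(rho, sigma):
    """S(rho, sigma) = tr sigma (ln sigma - ln rho) for full-rank density matrices."""
    def log_hermitian(a):
        w, v = np.linalg.eigh(a)
        return (v * np.log(w)) @ v.conj().T  # v diag(ln w) v^*
    diff = log_hermitian(sigma) - log_hermitian(rho)
    return float(np.real(np.trace(sigma @ diff)))

def classical_relative_entropy(r, x):
    """S(r, x) = sum_j x_j (ln x_j - ln r_j) for probability vectors r and x."""
    r, x = np.asarray(r, float), np.asarray(x, float)
    return float(np.sum(x * (np.log(x) - np.log(r))))

def random_unitary(d, rng):
    """Unitary obtained from the QR decomposition of a complex Gaussian matrix."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return q

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    r = np.array([0.5, 0.3, 0.2])   # ordered spectrum of rho (example values)
    x = np.array([0.6, 0.3, 0.1])   # ordered spectrum of sigma (example values)
    rho, sigma = np.diag(r).astype(complex), np.diag(x).astype(complex)
    print("commuting case:", relative_entropy(rho, sigma), classical_relative_entropy(r, x))
    u = random_unitary(3, rng)
    print("rotated sigma: ", relative_entropy(rho, u @ sigma @ u.conj().T))
```

Rotating $\sigma$ away from the joint eigenbasis should never produce a value below the classical expression, in line with the right hand side of (167).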

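If we are looking in particular at full estimation, the method used in the
proof of Theorem 5.2 can be improved significantly. The following lemma, which
expresses the rate function explicitly as a limit over a sequence of operators, is
of great use in the next subsection.

Lemma 5.3 Consider a full estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfying the LDP
with rate function $I : \mathcal{S} \times \mathcal{S} \to [0, \infty]$ and two states $\rho, \sigma \in \mathcal{S}$. There is a sequence
$(\Delta_N)_{N\in\mathbb{N}}$ of Borel sets $\Delta_N \subset \mathcal{S}$ satisfying

$$\lim_{N\to\infty} \mathrm{tr}\bigl(\sigma^{\otimes N} E_N(\Delta_N)\bigr) = 1, \tag{168}$$

$$\lim_{N\to\infty} -\frac{1}{N}\ln\mathrm{tr}\bigl(\rho^{\otimes N} E_N(\Delta_N)\bigr) = I(\rho, \sigma) \tag{169}$$

and

$$U \Delta_N U^* = \Delta_N \quad \forall U \in \mathrm{U}(d) \text{ with } [U, \sigma] = 0. \tag{170}$$

Proof. For each $k \in \mathbb{N}$ consider the set

$$\tilde\Delta_k = \{\omega \in \mathcal{S} \mid \|\sigma - \omega\|_1 < k^{-1}\} \subset \mathcal{S}, \tag{171}$$

which obviously has the symmetry property (170). Since the scheme $(E_N)_{N\in\mathbb{N}}$ is
consistent (since $(E_N)_{N\in\mathbb{N}}$ satisfies the LDP this follows directly from Definition
2.2) we have for each $k \in \mathbb{N}$ an index $N'_k \in \mathbb{N}$ such that

$$\mathrm{tr}\bigl(\sigma^{\otimes N} E_N(\tilde\Delta_k)\bigr) > 1 - \frac{1}{k} \tag{172}$$

holds for all $N \ge N'_k$. In addition we get for each $k \in \mathbb{N}$

$$\lim_{N\to\infty} -\frac{1}{N}\ln\mathrm{tr}\bigl(\rho^{\otimes N} E_N(\tilde\Delta_k)\bigr) = \inf_{\omega\in\tilde\Delta_k} I(\rho, \omega) \tag{173}$$

by combining the large deviation upper and lower bounds. Hence for each $k \in \mathbb{N}$
there is an $N''_k \in \mathbb{N}$ with

$$\Bigl| -\frac{1}{N}\ln\mathrm{tr}\bigl(\rho^{\otimes N} E_N(\tilde\Delta_k)\bigr) - \inf_{\omega\in\tilde\Delta_k} I(\rho, \omega) \Bigr| \le \frac{1}{k} \tag{174}$$

for all $N \ge N''_k$. Now let us recursively define a strictly increasing sequence
$(N_k)_{k\in\mathbb{N}}$ of integers by $N_1 = 1$ and $N_k = \max\{N'_k, N''_k, N_{k-1} + 1\}$, and set

$$\Delta_N = \tilde\Delta_k \quad \text{for } N_k \le N < N_{k+1}. \tag{175}$$

For each $N \ge N_k$ we therefore have an integer $l \ge k$ with $N_l \le N < N_{l+1}$ and
$\Delta_N = \tilde\Delta_l$. Since $N_l \le N$ implies in particular $N \ge N'_l$ we have due to (172)

$$\mathrm{tr}\bigl(\sigma^{\otimes N} E_N(\Delta_N)\bigr) = \mathrm{tr}\bigl(\sigma^{\otimes N} E_N(\tilde\Delta_l)\bigr) > 1 - \frac{1}{l} \ge 1 - \frac{1}{k}, \tag{176}$$

and this implies Equation (168). Similarly we have $N \ge N_l \ge N''_l$ and therefore
with (174)

$$\Bigl| -\frac{1}{N}\ln\mathrm{tr}\bigl(\rho^{\otimes N} E_N(\Delta_N)\bigr) - \inf_{\omega\in\Delta_N} I(\rho, \omega) \Bigr| = \Bigl| -\frac{1}{N}\ln\mathrm{tr}\bigl(\rho^{\otimes N} E_N(\tilde\Delta_l)\bigr) - \inf_{\omega\in\tilde\Delta_l} I(\rho, \omega) \Bigr| \le \frac{1}{l} \le \frac{1}{k}. \tag{177}$$

Now note that the sequence $(\Delta_N)_{N\in\mathbb{N}}$ forms a neighborhood base at $\sigma \in \mathcal{S}$,
more precisely

$$\Delta_{N+1} \subset \Delta_N \quad \forall N \in \mathbb{N} \quad\text{and}\quad \bigcap_{N=1}^{\infty} \Delta_N = \{\sigma\}. \tag{178}$$

Lower semi-continuity of $I_\rho(\,\cdot\,) = I(\rho, \,\cdot\,)$ implies in addition that

$$U_k = I_\rho^{-1}\bigl((I_\rho(\sigma) - k^{-1}, \infty]\bigr) \tag{179}$$

is for each $k \in \mathbb{N}$ an open neighborhood of $\sigma$. Hence we have an $M_k \in \mathbb{N}$ such
that $M > M_k$ implies $\Delta_M \subset U_k$ and therefore

$$I(\rho, \sigma) \ge \inf_{\omega\in\Delta_M} I(\rho, \omega) \ge I(\rho, \sigma) - \frac{1}{k} \quad \forall M > M_k. \tag{180}$$

Now assume that $N > \max\{N_k, M_k\}$; then we get with Equation (177)

$$\Bigl| -\frac{1}{N}\ln\mathrm{tr}\bigl(\rho^{\otimes N} E_N(\Delta_N)\bigr) - I(\rho, \sigma) \Bigr| \le \Bigl| -\frac{1}{N}\ln\mathrm{tr}\bigl(\rho^{\otimes N} E_N(\Delta_N)\bigr) - \inf_{\omega\in\Delta_N} I(\rho, \omega) \Bigr| + \Bigl| \inf_{\omega\in\Delta_N} I(\rho, \omega) - I(\rho, \sigma) \Bigr| \le \frac{2}{k} \tag{181}$$

and this implies Equation (169), which concludes the proof. □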



5.3 Pure states 

The main purpose of this section is to provide a proof of Equation (33), where
we have claimed that $I$ and $\mathcal{I}^c_{\mathrm{Id}}$ coincide for pure input states. This is basically
quite simple. We will take, however, a small detour which allows us to have a
closer look beyond the covariant case (Subsection 5.4).

Let us consider first a pure state $\rho$ and a mixed state $\sigma$. From Equation (26)
we see immediately that this implies $I(\rho, \sigma) = \infty$. Since $I$ is a lower bound on
$\mathcal{I}^c_{\mathrm{Id}}$ and $\mathcal{I}_{\mathrm{Id}}$ we get

$$\mathcal{I}_{\mathrm{Id}}(\rho, \sigma) = \mathcal{I}^c_{\mathrm{Id}}(\rho, \sigma) = I(\rho, \sigma) = \infty \quad \forall \rho \text{ pure},\ \sigma \text{ mixed}. \tag{182}$$

Hence only the case where $\rho$ and $\sigma$ are both pure needs to be discussed. For
the rest of this section we will therefore assume (unless something different is
explicitly stated) that

$$\rho = |\phi\rangle\langle\phi|, \quad \sigma = |\psi\rangle\langle\psi| \quad \text{with } \phi, \psi \in \mathcal{H},\ \|\phi\| = \|\psi\| = 1 \tag{183}$$

holds. The rate function $I$ then has the following simple structure:

$$I(\rho, \sigma) = -\ln\mathrm{tr}(\rho\sigma). \tag{184}$$

Now we need the following lemma which shows that we can assume without
loss of generality that the operators $E_N(\Delta_N)$ from Lemma 5.3 are rank one
projectors.

Lemma 5.4 Consider an admissible rate function $I \in \mathcal{E}(\mathrm{Id})$ and two pure states
$\rho = |\phi\rangle\langle\phi|$, $\sigma = |\psi\rangle\langle\psi|$. There is a sequence $(\Psi_N)_{N\in\mathbb{N}}$ of normalized vectors
$\Psi_N \in \mathcal{H}_+^{\otimes N}$ (the symmetric subspace of $\mathcal{H}^{\otimes N}$) such that

$$\liminf_{N\to\infty} -\frac{1}{N}\ln\bigl(|\langle\phi^{\otimes N}, \Psi_N\rangle|^2\bigr) \ge I(\rho, \sigma) \tag{185}$$

and

$$\lim_{N\to\infty} |\langle\Psi_N, \psi^{\otimes N}\rangle| = 1 \tag{186}$$

holds. If $I$ is covariant we can choose $\Psi_N = \psi^{\otimes N}$.

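Proof. Consider a full estimation scheme $(E_N)_{N\in\mathbb{N}}$ satisfying the LDP with
rate function $I$ and the sequence $(\Delta_N)_{N\in\mathbb{N}}$ of Borel sets $\Delta_N \subset \mathcal{S}$ from Lemma
5.3. Since only the overlaps of $E_N(\Delta_N)$ with $\phi^{\otimes N}$ and $\psi^{\otimes N}$ are of interest, we can
assume without loss of generality that $E_N(\Delta_N)$ is supported by the symmetric
tensor product $\mathcal{H}_+^{\otimes N}$. Now choose a $0 < \lambda < 1$ and denote the spectral projector
of $E_N(\Delta_N)$ belonging to the interval $[1 - \lambda, 1]$ by $P_{N,\lambda}$. Obviously we have due
to $E_N(\Delta_N) \le \mathbb{1}$

$$\langle\psi^{\otimes N}, E_N(\Delta_N)\psi^{\otimes N}\rangle \le \langle\psi^{\otimes N}, P_{N,\lambda}\psi^{\otimes N}\rangle + (1 - \lambda)\langle\psi^{\otimes N}, (\mathbb{1} - P_{N,\lambda})\psi^{\otimes N}\rangle \tag{187}$$

$$= (1 - \lambda) + \lambda\langle\psi^{\otimes N}, P_{N,\lambda}\psi^{\otimes N}\rangle. \tag{188}$$

Equation (168) therefore implies

$$\lim_{N\to\infty} \langle\psi^{\otimes N}, P_{N,\lambda}\psi^{\otimes N}\rangle = 1. \tag{189}$$

Hence for each $0 < \delta < 1$ there is an $N_\delta \in \mathbb{N}$ such that

$$\langle\psi^{\otimes N}, P_{N,\lambda}\psi^{\otimes N}\rangle > 1 - \delta \tag{190}$$

holds for all $N > N_\delta$. Now we define for $N$ with $P_{N,\lambda}\psi^{\otimes N} \ne 0$ (which due to
Equation (190) is true if $N$ is large enough)

$$\Psi_N = \frac{P_{N,\lambda}\psi^{\otimes N}}{\|P_{N,\lambda}\psi^{\otimes N}\|} \tag{191}$$

and $\Psi_N$ arbitrary for all other $N$. Equation (190) implies immediately (186).
The bound (185) follows from

$$\langle\phi^{\otimes N}, E_N(\Delta_N)\phi^{\otimes N}\rangle \ge (1 - \lambda)\langle\phi^{\otimes N}, P_{N,\lambda}\phi^{\otimes N}\rangle, \tag{192}$$

which in turn implies

$$I(\rho, \sigma) = \lim_{N\to\infty} -\frac{1}{N}\ln\langle\phi^{\otimes N}, E_N(\Delta_N)\phi^{\otimes N}\rangle \tag{193}$$

$$\le \liminf_{N\to\infty} -\frac{1}{N}\ln\langle\phi^{\otimes N}, (1 - \lambda)P_{N,\lambda}\phi^{\otimes N}\rangle \tag{194}$$

$$= \liminf_{N\to\infty} -\frac{1}{N}\ln\langle\phi^{\otimes N}, P_{N,\lambda}\phi^{\otimes N}\rangle \tag{195}$$

$$\le \liminf_{N\to\infty} -\frac{1}{N}\ln\bigl(|\langle\phi^{\otimes N}, \Psi_N\rangle|^2\bigr), \tag{196}$$

where we have used in the last step that $P_{N,\lambda}\Psi_N = \Psi_N$ and therefore
$P_{N,\lambda} \ge |\Psi_N\rangle\langle\Psi_N|$ holds if $N$ is large enough.

Now assume that $I$ is covariant. This implies by definition that we can choose
$(E_N)_{N\in\mathbb{N}}$ to be covariant as well and we get according to Equation (170)

$$U^{\otimes N} E_N(\Delta_N) U^{\otimes N *} = E_N(\Delta_N) \quad \forall U \in \mathrm{U}(d) \text{ with } [U, \sigma] = 0. \tag{197}$$

Since $P_{N,\lambda}$ is a spectral projector of $E_N(\Delta_N)$ we get $U^{\otimes N} P_{N,\lambda} U^{\otimes N *} = P_{N,\lambda}$
for the same set of $U$, and since $\sigma = |\psi\rangle\langle\psi|$ this implies

$$U^{\otimes N} P_{N,\lambda}\psi^{\otimes N} = P_{N,\lambda} U^{\otimes N}\psi^{\otimes N} = P_{N,\lambda}\psi^{\otimes N}, \quad\text{hence}\quad U^{\otimes N}\Psi_N = \Psi_N \tag{198}$$

for all $U$ with $U\psi = \psi$ and all $\Psi_N$ from Equation (191). It is easy to see that
$\Psi_N = \psi^{\otimes N}$ is the only vector in $\mathcal{H}_+^{\otimes N}$ with this property. □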
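The two operator inequalities behind (187), (188) and (192), namely $(1 - \lambda) P \le E \le P + (1 - \lambda)(\mathbb{1} - P)$ for the spectral projector $P$ of an effect $0 \le E \le \mathbb{1}$ belonging to $[1 - \lambda, 1]$, can be checked on random finite-dimensional examples. The following is only a sketch with hypothetical helper names and randomly generated test operators; it is not tied to any particular estimation scheme.

```python
import numpy as np

def spectral_projector(effect, lam):
    """Projector onto the eigenvectors of `effect` with eigenvalue in [1 - lam, 1]."""
    w, v = np.linalg.eigh(effect)
    keep = v[:, w >= 1.0 - lam]
    return keep @ keep.conj().T

def random_effect(d, rng):
    """Random operator with 0 <= E <= 1 (hypothetical test data)."""
    z = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    q, _ = np.linalg.qr(z)
    return (q * rng.uniform(0.0, 1.0, size=d)) @ q.conj().T

def expectation(vec, op):
    """<vec, op vec> for a Hermitian operator op."""
    return float(np.real(np.vdot(vec, op @ vec)))

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    lam = 0.3
    e = random_effect(6, rng)
    p = spectral_projector(e, lam)
    v = rng.normal(size=6) + 1j * rng.normal(size=6)
    v /= np.linalg.norm(v)
    ev, pv = expectation(v, e), expectation(v, p)
    print((1 - lam) * pv, "<=", ev, "<=", (1 - lam) + lam * pv)
```

The upper bound is the one used in (187) and (188) to derive (189); the lower bound is Equation (192).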

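With this lemma it is now very easy to determine $\mathcal{I}^c_{\mathrm{Id}}(\rho, \,\cdot\,)$ for pure input
states $\rho$. As already stated, we get (cf. in this context the analysis
of covariant pure state estimation in [19])

Proposition 5.5 For each pure state $\rho$ and all $\sigma \in \mathcal{S}$ the equality

$$\mathcal{I}^c_{\mathrm{Id}}(\rho, \sigma) = I(\rho, \sigma) = \begin{cases} \infty & \text{if } \sigma \text{ is mixed} \\ -\ln\mathrm{tr}(\rho\sigma) & \text{if } \sigma \text{ is pure} \end{cases} \tag{199}$$

holds.

Proof. Since $I$ is covariant we have $\mathcal{I}^c_{\mathrm{Id}}(\rho, \sigma) \ge I(\rho, \sigma)$ for all $\rho, \sigma \in \mathcal{S}$. If $\rho$ is
pure and $\sigma$ is mixed we have $I(\rho, \sigma) = \infty$ and therefore $\mathcal{I}^c_{\mathrm{Id}}(\rho, \sigma) = I(\rho, \sigma)$. If
both states are pure we get from Lemma 5.4

$$\mathcal{I}^c_{\mathrm{Id}}(\rho, \sigma) \le \lim_{N\to\infty} -\frac{1}{N}\ln|\langle\phi^{\otimes N}, \psi^{\otimes N}\rangle|^2 = -\ln\mathrm{tr}(\rho\sigma) = I(\rho, \sigma) \tag{200}$$

which concludes the proof. □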
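The middle equality in (200) holds for every finite $N$, because the overlap of two product vectors factorizes. A minimal numerical sketch (function names and the example vectors are assumptions chosen for illustration) builds the tensor powers explicitly for small $N$ and compares both sides.

```python
import numpy as np
from functools import reduce

def tensor_power(vec, n):
    """n-fold tensor power of a vector, built with Kronecker products."""
    return reduce(np.kron, [vec] * n)

def overlap_rate(phi, psi, n):
    """-(1/n) ln |<phi^{(x)n}, psi^{(x)n}>|^2."""
    amplitude = np.vdot(tensor_power(phi, n), tensor_power(psi, n))
    return -np.log(abs(amplitude) ** 2) / n

def pure_state_rate(phi, psi):
    """-ln tr(rho sigma) for rho = |phi><phi| and sigma = |psi><psi|."""
    rho = np.outer(phi, phi.conj())
    sigma = np.outer(psi, psi.conj())
    return -np.log(np.real(np.trace(rho @ sigma)))

if __name__ == "__main__":
    psi = np.array([1.0, 0.0], dtype=complex)
    phi = np.array([np.sqrt(0.7), np.exp(0.4j) * np.sqrt(0.3)])  # example values
    for n in (1, 2, 5, 8):
        print(n, overlap_rate(phi, psi, n), pure_state_rate(phi, psi))
```

Explicit tensor powers grow like $2^N$, so this check is only practical for small $N$.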

Together with the arguments from Section 4.4 this result supports our conjecture
from Section 3.3 that $\mathcal{I}^c_{\mathrm{Id}}$ and $I$ coincide also for mixed input states.



5.4 Beyond covariance 

If we look at Equation (186) and compare it with the reasoning in the last
proof we might think that covariance is not really needed here, because $\Psi_N$
converges to $\psi^{\otimes N}$ in the limit $N \to \infty$ even without further assumptions on $I$.
This impression, however, is wrong, because the vectors $\psi^{\otimes N}$ and $\phi^{\otimes N}$ become
more and more orthogonal as $N$ increases and therefore the part of $\Psi_N$ which
is orthogonal to $\psi^{\otimes N}$ can play a crucial role (although it vanishes in the limit
$N \to \infty$). The relation of the optimal rate functions $\mathcal{I}_{\mathrm{Id}}$ and $\mathcal{I}^c_{\mathrm{Id}}$ to $I$ and
relative entropy $S$ needs therefore more discussion. Although we are not yet
able to give complete results we will collect in the following some (informal)
arguments which support the two conjectures $\mathcal{I}_{\mathrm{Id}} = S$ and $\mathcal{I}^c_{\mathrm{Id}} = I$ from the
end of Section 3.3.

As in the last section we will consider only pure states, i.e. we will evaluate
a rate function $I(\rho, \sigma)$ only for $\rho = |\phi\rangle\langle\phi|$ and $\sigma = |\psi\rangle\langle\psi|$. In addition we will
assume that $\mathcal{H}$ is two-dimensional (this can be done without loss of generality,
because we just have to replace $\mathcal{H}$ with the subspace generated by $\psi$ and $\phi$).
Hence we can set

$$\psi = |0\rangle \quad\text{and}\quad \phi = \phi_{p,\alpha} = \sqrt{p}\,|0\rangle + e^{i\alpha}\sqrt{1 - p}\,|1\rangle \tag{201}$$

with $0 < p < 1$, $\alpha \in (-\pi, \pi]$ and an arbitrary but fixed basis $|0\rangle, |1\rangle$ of $\mathcal{H}$. In the
number basis $|k; N\rangle \in \mathcal{H}_+^{\otimes N}$, $k = 0, \ldots, N$,

$$|k; N\rangle = \binom{N}{k}^{1/2} S_N\, |0\rangle^{\otimes (N - k)} \otimes |1\rangle^{\otimes k} \tag{202}$$

(where $S_N$ is the projector to $\mathcal{H}_+^{\otimes N}$) the vectors $\Psi_N \in \mathcal{H}_+^{\otimes N}$ from Lemma 5.4
can then be written as

$$\Psi_N = \sum_{k=0}^{N} f_{N,k}\, |k; N\rangle \tag{203}$$

and $\phi^{\otimes N}$ becomes

$$\phi^{\otimes N} = \phi_{p,\alpha}^{\otimes N} = \sum_{k=0}^{N} \binom{N}{k}^{1/2} \sqrt{p^{N-k}(1 - p)^k}\; e^{ik\alpha}\, |k; N\rangle. \tag{204}$$
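The expansion (204) can be verified for small $N$ by building the number states explicitly. In the sketch below, $|k; N\rangle$ is realized as the normalized equal-weight superposition of all basis strings with $k$ ones, which is what (202) amounts to when $S_N$ is the symmetrization projector; the function names and parameter values are assumptions made for this illustration.

```python
import numpy as np
from functools import reduce
from itertools import combinations
from math import comb

def number_state(k, n):
    """|k;N>: normalized sum of all computational basis strings of length n with k ones."""
    vec = np.zeros(2 ** n, dtype=complex)
    for ones in combinations(range(n), k):
        index = sum(1 << (n - 1 - pos) for pos in ones)
        vec[index] = 1.0
    return vec / np.sqrt(comb(n, k))

def phi_vector(p, alpha):
    """phi_{p,alpha} = sqrt(p)|0> + exp(i alpha) sqrt(1-p)|1>, as in (201)."""
    return np.array([np.sqrt(p), np.exp(1j * alpha) * np.sqrt(1 - p)])

def number_basis_coefficients(p, alpha, n):
    """Coefficients of phi^{(x)N} in the number basis, following (204)."""
    k = np.arange(n + 1)
    binom = np.array([comb(n, j) for j in k], dtype=float)
    return np.sqrt(binom * p ** (n - k) * (1 - p) ** k) * np.exp(1j * k * alpha)

if __name__ == "__main__":
    p, alpha, n = 0.35, 0.8, 4  # example values
    direct = reduce(np.kron, [phi_vector(p, alpha)] * n)
    coeffs = number_basis_coefficients(p, alpha, n)
    rebuilt = sum(c * number_state(k, n) for k, c in enumerate(coeffs))
    print(np.allclose(direct, rebuilt))
```

The squared moduli of the coefficients form a binomial distribution in $k$, which is why only the lowest number states matter when $p$ is close to one.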

Let us consider the conjecture $\mathcal{I}_{\mathrm{Id}} = S$ first. In the case of pure states this
would imply that we can find for each pair of pure states $\sigma \ne \rho_0$ an admissible
rate function $I$ with $I(\rho_0, \sigma) = \infty$. A possible way to prove this could consist of
two steps:

• Step 1. Find a sequence $(A_N)_{N\in\mathbb{N}}$ of operators such that

$$\lim_{N\to\infty} -\frac{1}{N}\ln\mathrm{tr}\bigl(\rho_0^{\otimes N} A_N\bigr) = \infty, \qquad \lim_{N\to\infty} \mathrm{tr}\bigl(\sigma^{\otimes N} A_N\bigr) = 1 \tag{205}$$

and

$$\lim_{N\to\infty} -\frac{1}{N}\ln\mathrm{tr}\bigl(\rho^{\otimes N} A_N\bigr) = I^A(\rho) > 0 \quad \forall \rho \ne \sigma \tag{206}$$

holds.

• Step 2. Find a full estimation scheme $(E_N)_{N\in\mathbb{N}}$ and a sequence $(\Delta_N)_{N\in\mathbb{N}}$
of Borel sets $\Delta_N \subset \mathcal{S}$ shrinking to $\sigma$ such that $E_N(\Delta_N) = A_N$ holds for
all $N \in \mathbb{N}$.

To implement the second step we would need a converse of Lemma 5.3 and such
a result is (unfortunately) not yet available. The problem here is not to construct
some POV measures with $E_N(\Delta_N) = A_N$, but to construct them such that the
resulting scheme satisfies the LDP (which includes in particular consistency). It
seems, however, that this is more a technical than a fundamental problem.

The first step is much easier to perform. (However, it is not sufficient to find
a sequence of tests which saturates the bound from Stein's lemma, because
Equation (206) would not necessarily hold in this case.) Assume that
$\rho_0 = |\phi_{q,\beta}\rangle\langle\phi_{q,\beta}|$ holds with $\phi_{q,\beta}$ from (201). Then we set
$A_N = |\Psi_N\rangle\langle\Psi_N|$ and define $\Psi_N$ according to (203) by

$$f_{N,0} = -\mathcal{N}_N \sqrt{N}\sqrt{1 - q}\; e^{-i\beta}, \qquad f_{N,1} = \mathcal{N}_N \sqrt{q} \tag{207}$$

with the normalization

$$\mathcal{N}_N = \bigl(N(1 - q) + q\bigr)^{-1/2} \tag{208}$$

and $f_{N,k} = 0$ for all $k > 1$. With the scalar product taken antilinear in its first
argument we have

$$\langle\Psi_N, \phi_{q,\beta}^{\otimes N}\rangle = 0 \quad\text{and}\quad \lim_{N\to\infty} |f_{N,0}| = 1, \tag{209}$$

which implies Equations (205). On the other hand we get $I^A(\rho) = -\ln\mathrm{tr}(\rho\sigma)$
for each pure $\rho \ne \rho_0$ and therefore Equation (206) holds as well. Hence there
is strong evidence behind the conjecture $\mathcal{I}_{\mathrm{Id}} = S$ from Section 3.3 (at least for
pure input states).

The method used in the last paragraph can be easily generalized to construct
a sequence of operators $(A_N)_{N\in\mathbb{N}}$ such that the function $I^A$ from (206) becomes
infinite at finitely many points or even on a countable dense subset of the space
$\mathcal{P}$ of pure states. This is, however, not sufficient to disprove the conjecture
$\mathcal{I}^c_{\mathrm{Id}} = I$, because in this case we would need $(A_N)_{N\in\mathbb{N}}$ such that $I^A$ becomes
lower semi-continuous: $I^A(\rho_0) > -\ln\mathrm{tr}(\rho_0\sigma)$ for one state $\rho_0$ implies for such
an $I^A$ that $I^A(\rho) > -\ln\mathrm{tr}(\rho\sigma)$ holds for all $\rho$ in a whole neighborhood of $\rho_0$ in $\mathcal{P}$.
We will show in the following why it is (at least) very difficult to find a sequence
$(A_N)_{N\in\mathbb{N}}$ with this special property.

To this end consider $A_N = |\Psi_N\rangle\langle\Psi_N|$ with $\Psi_N$ from Lemma 5.4 and a fixed
$0 < p < 1$ such that

$$\lim_{N\to\infty} -\frac{1}{N}\ln\bigl(|\langle\Psi_N, \phi_{p,\alpha}^{\otimes N}\rangle|^2\bigr) > -\ln\bigl(|\langle\psi, \phi_{p,\alpha}\rangle|^2\bigr) = -\ln p \tag{210}$$

holds for all $\alpha$ with $-\pi < \alpha_- < \alpha < \alpha_+ < \pi$ for some bounds $\alpha_-, \alpha_+$. To
rewrite this in a more convenient way let us identify the interval $(-\pi, \pi]$ with
the unit circle $S^1$ and consider the sequence $(F_N)_{N\in\mathbb{N}}$, $F_N \in L^2(S^1)$ with

$$F_N = \|\tilde F_N\|^{-1}\tilde F_N, \qquad \tilde F_N(\alpha) = \langle\Psi_N, \phi_{p,\alpha}^{\otimes N}\rangle. \tag{211}$$

In the orthonormal basis $(e_k)_{k\in\mathbb{Z}}$, $e_k \in L^2(S^1)$, $e_k(\alpha) = (2\pi)^{-1/2}\exp(ik\alpha)$ these
vectors become

$$\tilde F_N(\alpha) = \sum_{k=0}^{N} \overline{f_{N,k}} \binom{N}{k}^{1/2} \sqrt{p^{N-k}(1 - p)^k}\; e^{ik\alpha}, \tag{212}$$

hence all $F_N$ are elements of the positive frequency subspace

$$H^2(S^1) = \overline{\mathrm{span}}\{e_k \mid k \ge 0\} \subset L^2(S^1). \tag{213}$$

In addition we can conclude immediately from Equation (186) and $|0; N\rangle = \psi^{\otimes N}$
the inequality

$$\|\tilde F_N\|^2 \ge 2\pi\,|f_{N,0}|^2\, p^N, \quad\text{hence}\quad \liminf_{N\to\infty} \frac{1}{N}\ln\|\tilde F_N\|^2 \ge \ln p. \tag{214}$$

Hence to get (210) the functions $F_N$ have to converge pointwise and exponentially
fast to $0$ on the interval $(\alpha_-, \alpha_+)$. To find such a sequence is difficult due
to the following lemma.

Lemma 5.6 A function $F \in H^2(S^1)$ which vanishes on a non-empty subinterval
$(\alpha_-, \alpha_+)$ of $S^1$ vanishes completely.

The proof of this lemma uses the fact that each smooth element of $H^2(S^1)$ is
the boundary value of an analytic function on the unit disc (cf. [37] for details).
For us it shows that the $F_N$ can not vanish on $(\alpha_-, \alpha_+)$ because $\|F_N\| = 1$
by construction. It is even impossible that the sequence $(F_N)_{N\in\mathbb{N}}$ converges (in
norm) to a function $F \in L^2(S^1)$, because this $F$ would satisfy again $\|F\| = 1$,
$F \in H^2(S^1)$ and $F(\alpha) = 0$ for all $\alpha \in (\alpha_-, \alpha_+)$. The only way out is to find a
sequence which does not converge for all $\alpha$. Such a sequence can be constructed if
we allow infinitely fast oscillations in the limit $N \to \infty$ (start with a sequence
which converges in $L^2(S^1)$ and shift its elements to the positive frequency space).
However, even then there are two additional requirements: 1. The vectors $\Psi_N$
(and therefore the coefficients $f_{N,k}$) have to satisfy the constraints $\|\Psi_N\| = 1$
and $\lim_{N\to\infty} |f_{N,0}| = 1$, and 2. $\lim_{N\to\infty} F_N(\alpha) = 0$ must hold not only for all
$\alpha \in (\alpha_-, \alpha_+)$, but also for all $p \in (p_-, p_+)$ for some $0 < p_- < p_+ < 1$. We
have not yet succeeded in constructing a sequence $(\Psi_N)_{N\in\mathbb{N}}$ which satisfies all
these conditions, but what we can say already at this point is the following: If
there is a rate function $I' \in \mathcal{E}^c(\mathrm{Id})$ with $I'(\rho, \sigma) > I(\rho, \sigma)$ for some $\rho, \sigma$, then
the corresponding estimation scheme must develop very irregular behavior with
respect to relative phases and this indicates that a more detailed analysis of
phase estimation might solve our problem.



A Some material from large deviations theory 

The purpose of this appendix is to collect some material about large deviation 
theory which is used throughout this paper. For a more detailed presentation 
we refer the reader to monographs like jl4l 1131 [TH] . 

Definition A.1 A function $I : X \to [0, \infty]$ on a locally compact, separable,
metric space $X$ is called a rate function if

1. $I \not\equiv \infty$.

2. $I$ is lower semi-continuous.

3. $I$ has compact level sets, i.e. $I^{-1}((-\infty, c])$ is compact for all $c \in \mathbb{R}$.

Definition A.2 Let $(\mu_N)_{N\in\mathbb{N}}$ be a sequence of probability measures
on the Borel subsets of a locally compact, separable metric space $X$ and
$I : X \to [0, \infty]$ a rate function in the sense of Definition A.1. We say that $(\mu_N)_{N\in\mathbb{N}}$
satisfies the large deviation principle with rate function $I$ if the
following conditions hold:

1. For each closed subset $\Delta \subset X$ we have

$$\limsup_{N\to\infty} \frac{1}{N}\ln\mu_N(\Delta) \le -\inf_{x\in\Delta} I(x). \tag{215}$$

2. For each open subset $\Delta \subset X$ we have

$$\liminf_{N\to\infty} \frac{1}{N}\ln\mu_N(\Delta) \ge -\inf_{x\in\Delta} I(x). \tag{216}$$

The most relevant consequence of this definition is the following theorem of
Varadhan [35], which describes the behavior of some expectation values in the
limit $N \to \infty$:

Theorem A.3 (Varadhan) Consider a sequence $(\mu_N)_{N\in\mathbb{N}}$ of probability
measures on $X$ satisfying the large deviation principle with rate function
$I : X \to [0, \infty]$ and a continuous function $f : X \to \mathbb{R}$ which is bounded from
below. Then the following equality holds:

$$\lim_{N\to\infty} \frac{1}{N}\ln\int_X e^{-N f(x)}\,\mu_N(dx) = -\inf_{x\in X}\bigl(f(x) + I(x)\bigr). \tag{217}$$
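A standard classical example may help to see Equation (217) at work. The sketch below is an assumption-laden illustration and not part of the paper: $\mu_N$ is taken to be the distribution of the empirical mean of $N$ fair coin flips, whose rate function $I(x) = x\ln(2x) + (1-x)\ln(2(1-x))$ on $[0,1]$ is known from Cramér's theorem, and the test function $f$ is an arbitrary choice. The left hand side is evaluated exactly in log space, the right hand side on a grid.

```python
import numpy as np

def log_binom_pmf(n, p):
    """Natural log of the Binomial(n, p) probabilities for k = 0, ..., n."""
    log_fact = np.concatenate(([0.0], np.cumsum(np.log(np.arange(1, n + 1)))))
    k = np.arange(n + 1)
    log_choose = log_fact[n] - log_fact[k] - log_fact[n - k]
    return log_choose + k * np.log(p) + (n - k) * np.log1p(-p)

def coin_rate_function(x):
    """I(x) = x ln(2x) + (1-x) ln(2(1-x)) on [0, 1], with the convention 0 ln 0 = 0."""
    x = np.asarray(x, float)
    with np.errstate(divide="ignore", invalid="ignore"):
        a = np.where(x > 0, x * np.log(2 * x), 0.0)
        b = np.where(x < 1, (1 - x) * np.log(2 * (1 - x)), 0.0)
    return a + b

def varadhan_lhs(n, f):
    """(1/n) ln E[exp(-n f(K/n))] for K ~ Binomial(n, 1/2), via a stable log-sum-exp."""
    k = np.arange(n + 1)
    log_terms = log_binom_pmf(n, 0.5) - n * f(k / n)
    shift = log_terms.max()
    return (shift + np.log(np.exp(log_terms - shift).sum())) / n

def varadhan_rhs(f, grid_size=200001):
    """-inf_x (f(x) + I(x)), approximated on a uniform grid of [0, 1]."""
    x = np.linspace(0.0, 1.0, grid_size)
    return -np.min(f(x) + coin_rate_function(x))

if __name__ == "__main__":
    f = lambda x: (x - 0.8) ** 2  # arbitrary bounded continuous test function
    for n in (10, 100, 2000):
        print(n, varadhan_lhs(n, f), varadhan_rhs(f))
```

The difference between the two sides should shrink roughly like $1/N$, since only a sub-exponential prefactor is neglected in the limit.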

Varadhan's theorem has a converse: If we know that a sequence of measures
$\mu_N$ satisfies Equation (217) for all bounded continuous functions, it can be shown
that the $\mu_N$ satisfy the large deviation principle as well. Following [...] we have:

Definition A.4 Let $(\mu_N)_{N\in\mathbb{N}}$ be a sequence of measures on a locally compact,
separable metric space $X$ and $I : X \to [0, \infty]$ a rate function. We say that
$(\mu_N)_{N\in\mathbb{N}}$ satisfies the Laplace principle with rate function $I$, if we have

$$\lim_{N\to\infty} \frac{1}{N}\ln\int_X e^{-N f(x)}\,\mu_N(dx) = -\inf_{x\in X}\bigl(f(x) + I(x)\bigr) \tag{218}$$

for all bounded continuous functions $f : X \to \mathbb{R}$.

Theorem A.5 The Laplace principle implies the large deviation principle with
the same rate function.